**Stefan Kiefer · Christine Tasson (Eds.)**

# **Foundations of Software Science and Computation Structures**

**24th International Conference, FOSSACS 2021 Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021 Luxembourg City, Luxembourg, March 27 – April 1, 2021 Proceedings**

# Lecture Notes in Computer Science 12650

Founding Editors

Gerhard Goos, Germany
Juris Hartmanis, USA

### Editorial Board Members

Elisa Bertino, USA
Wen Gao, China
Bernhard Steffen, Germany
Gerhard Woeginger, Germany
Moti Yung, USA

# Advanced Research in Computing and Software Science Subline of Lecture Notes in Computer Science

Subline Series Editors

Giorgio Ausiello, University of Rome 'La Sapienza', Italy
Vladimiro Sassone, University of Southampton, UK

Subline Advisory Board

Susanne Albers, TU Munich, Germany
Benjamin C. Pierce, University of Pennsylvania, USA
Bernhard Steffen, University of Dortmund, Germany
Deng Xiaotie, Peking University, Beijing, China
Jeannette M. Wing, Microsoft Research, Redmond, WA, USA

More information about this subseries at http://www.springer.com/series/7407

# Foundations of Software Science and Computation Structures

24th International Conference, FOSSACS 2021 Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021 Luxembourg City, Luxembourg, March 27 – April 1, 2021 Proceedings

Editors

Stefan Kiefer
University of Oxford
Oxford, UK

Christine Tasson
Sorbonne Université - LIP6
Paris, France

ISSN 0302-9743  ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-71994-4  ISBN 978-3-030-71995-1 (eBook)
https://doi.org/10.1007/978-3-030-71995-1

LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© The Editor(s) (if applicable) and The Author(s) 2021. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

### ETAPS Foreword

Welcome to the 24th ETAPS! ETAPS 2021 was originally planned to take place in Luxembourg in its beautiful capital Luxembourg City. Because of the Covid-19 pandemic, this was changed to an online event.

ETAPS 2021 was the 24th instance of the European Joint Conferences on Theory and Practice of Software. ETAPS is an annual federated conference established in 1998, and consists of four conferences: ESOP, FASE, FoSSaCS, and TACAS. Each conference has its own Program Committee (PC) and its own Steering Committee (SC). The conferences cover various aspects of software systems, ranging from theoretical computer science to foundations of programming languages, analysis tools, and formal approaches to software engineering. Organising these conferences in a coherent, highly synchronised conference programme enables researchers to participate in an exciting event, having the possibility to meet many colleagues working in different directions in the field, and to easily attend talks of different conferences. On the weekend before the main conference, numerous satellite workshops take place that attract many researchers from all over the globe.

ETAPS 2021 received 260 submissions in total, 115 of which were accepted, yielding an overall acceptance rate of 44.2%. I thank all the authors for their interest in ETAPS, all the reviewers for their reviewing efforts, the PC members for their contributions, and in particular the PC (co-)chairs for their hard work in running this entire intensive process. Last but not least, my congratulations to all authors of the accepted papers!

ETAPS 2021 featured the unifying invited speakers Scott Smolka (Stony Brook University) and Jane Hillston (University of Edinburgh) and the conference-specific invited speakers Işil Dillig (University of Texas at Austin) for ESOP and Willem Visser (Stellenbosch University) for FASE. Invited tutorials were provided by Erika Ábrahám (RWTH Aachen University) on analysis of hybrid systems and Madhusudan Parthasarathy (University of Illinois at Urbana-Champaign) on combining machine learning and formal methods.

ETAPS 2021 was originally supposed to take place in Luxembourg City, Luxembourg organized by the SnT - Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg. University of Luxembourg was founded in 2003. The university is one of the best and most international young universities with 6,700 students from 129 countries and 1,331 academics from all over the globe. The local organisation team consisted of Peter Y.A. Ryan (general chair), Peter B. Roenne (organisation chair), Joaquin Garcia-Alfaro (workshop chair), Magali Martin (event manager), David Mestel (publicity chair), and Alfredo Rial (local proceedings chair).

ETAPS 2021 was further supported by the following associations and societies: ETAPS e.V., EATCS (European Association for Theoretical Computer Science), EAPLS (European Association for Programming Languages and Systems), and EASST (European Association of Software Science and Technology).

The ETAPS Steering Committee consists of an Executive Board, and representatives of the individual ETAPS conferences, as well as representatives of EATCS, EAPLS, and EASST. The Executive Board consists of Holger Hermanns (Saarbrücken), Marieke Huisman (Twente, chair), Jan Kofron (Prague), Barbara König (Duisburg), Gerald Lüttgen (Bamberg), Caterina Urban (INRIA), Tarmo Uustalu (Reykjavik and Tallinn), and Lenore Zuck (Chicago).

Other members of the steering committee are: Patricia Bouyer (Paris), Einar Broch Johnsen (Oslo), Dana Fisman (Be'er Sheva), Jan-Friso Groote (Eindhoven), Esther Guerra (Madrid), Reiko Heckel (Leicester), Joost-Pieter Katoen (Aachen and Twente), Stefan Kiefer (Oxford), Fabrice Kordon (Paris), Jan Křetínský (Munich), Kim G. Larsen (Aalborg), Tiziana Margaria (Limerick), Andrew M. Pitts (Cambridge), Grigore Roșu (Illinois), Peter Ryan (Luxembourg), Don Sannella (Edinburgh), Lutz Schröder (Erlangen), Ilya Sergey (Singapore), Mariëlle Stoelinga (Twente), Gabriele Taentzer (Marburg), Christine Tasson (Paris), Peter Thiemann (Freiburg), Jan Vitek (Prague), Anton Wijs (Eindhoven), Manuel Wimmer (Linz), and Nobuko Yoshida (London).

I'd like to take this opportunity to thank all the authors, attendees, organizers of the satellite workshops, and Springer-Verlag GmbH for their support. I hope you all enjoyed ETAPS 2021.

Finally, a big thanks to Peter, Peter, Magali and their local organisation team for all their enormous efforts to make ETAPS a fantastic online event. I hope there will be a next opportunity to host ETAPS in Luxembourg.

February 2021

Marieke Huisman
ETAPS SC Chair
ETAPS e.V. President

### Preface

This volume contains the papers accepted for the 24th International Conference on Foundations of Software Science and Computation Structures (FoSSaCS). The conference series is dedicated to foundational research with a clear significance for software science. It brings together research on theories and methods to support the analysis, integration, synthesis, transformation, and verification of programs and software systems.

This volume contains 28 contributed papers selected from 88 paper submissions. Each submission was reviewed by at least three Program Committee members, with the help of external reviewers, and the final decisions took into account the feedback from a rebuttal phase. The conference submissions were managed using the EasyChair conference system, which was also used to assist with the compilation of these proceedings.

We wish to thank all the authors who submitted papers to FoSSaCS 2021, the Program Committee members, the Steering Committee members, the external reviewers, and the ETAPS 2021 organizers. Due to the Covid-19 pandemic, ETAPS 2021 was held online.

July 2021

Stefan Kiefer
Christine Tasson

### Organization

#### Program Committee

Sandra Alves, University of Porto
Zena M. Ariola, University of Oregon
Giorgio Bacci, Aalborg University
Nathalie Bertrand, Inria
Véronique Bruyère, University of Mons
Dmitry Chistikov, The University of Warwick
Ugo Dal Lago, Università di Bologna & Inria Sophia Antipolis
Valeria de Paiva, Santa Clara University and Topos Institute
Jacques Garrigue, Nagoya University
Robert Harper, Carnegie Mellon University
Piotr Hofman, University of Warsaw
Stefan Kiefer, University of Oxford
Dexter Kozen, Cornell University
Sebastian Maneth, Universität Bremen
Giulio Manzonetto, Université Sorbonne Paris Nord
Samuel Mimram, École Polytechnique
Peter Selinger, Dalhousie University
Mahsa Shirmohammadi, CNRS
Filip Sieczkowski, University of Wrocław
Jeremy Sproston, University of Turin
Thomas Streicher, TU Darmstadt
Christine Tasson, Sorbonne Université
Nikos Tzevelekos, Queen Mary University of London
Rob van Glabbeek, Data61 - CSIRO

### Additional Reviewers

Alcolei, Aurore Amy, Matthew Angiuli, Carlo Avanzini, Martin Barenbaum, Pablo Behr, Nicolas Biernacki, Dariusz Bjorndahl, Adam Blumensath, Achim Bollig, Benedikt Breuvart, Flavien

Brookes, Steve Carbone, Marco Christoff, Zoé Clemente, Lorenzo Colcombet, Thomas Czerwiński, Wojciech de Frutos-Escrig de Liguoro, Ugo de Visme, Marc Demangeon, Romain Deng, Yuxin

Derakhshan, Farzaneh Din, Crystal Chang Dixon, Alex Doyen, Laurent Dubut, Jérémy Duncan, Ross Echahed, Rachid Enea, Constantin Espírito Santo, José Exibard, Léo Fahrenberg, Uli Falcone, Yliès Fijalkow, Nathanaël Finster, Eric Flesch, János Fortin, Marie Frey, Jonas Garner, Richard Gavazzo, Francesco Gay, Simon Geoffroy, Guillaume Giacobazzi, Roberto Graham-Lengrand, Stéphane Gruber, Hermann Guerrieri, Giulio Gusev, Vladimir Haase, Christoph Heijltjes, Willem Hirschkoff, Daniel Hirschowitz, Tom Hoffmann, Jan Hou, Zhe Husfeldt, Thore Jansen, David N. Kaminski, Benjamin Lucien Kerjean, Marie Klin, Bartek Kop, Cynthia Kopczynski, Eryk Koutavas, Vasileios Kura, Satoshi Kuske, Dietrich König, Barbara Laarman, Alfons Lahav, Ori Laird, James

Lam, Vitus Laroussinie, Francois Lasota, Sławomir Lazić, Ranko Le Roux, Stéphane Lefaucheux, Engel Leroux, Jérôme Levy, Jordi Lin, Yu-Yang Maillard, Kenji Malacaria, Pasquale Malbos, Philippe Maletti, Andreas Mansfield, Shane Mardare, Radu Marin, Sonia Markey, Nicolas Maruyama, Yoshihiro Maschio, Samuele Matthes, Ralph Mayr, Richard Mazowiecki, Filip Mazza, Damiano McCusker, Guy Michaux, Christian Mikulski, Lukasz Milius, Stefan Moerman, Joshua Moreira, Nelma Muscholl, Anca Møgelberg, Rasmus Ejlers Najib, Muhammad Nanevski, Aleksandar Nordvall Forsberg, Fredrik O'Connor, Liam Padovani, Luca Parys, Paweł Pasquali, Fabio Pattathurajan, Mohnish Patterson, Daniel Pérez, Guillermo Pfenning, Frank Piedeleu, Robin Pimentel, Elaine Piróg, Maciej Pistone, Paolo

Pitts, Andrew Pous, Damien Przybyłko, Marcin Puppis, Gabriele Purser, David Péchoux, Romain Quaas, Karin Raskin, Jean-François Riba, Colin Rowe, Reuben Ryzhikov, Andrew Saikawa, Takafumi Sammartino, Matteo Sangnier, Arnaud Schmitz, Sylvain Schöpp, Ulrich Seely, Robert Sobocinski, Pawel Sofronie-Stokkermans, Viorica Soudjani, Sadegh Sprunger, David

Srba, Jiri Stefanescu, Gheorghe Sterling, Jonathan Studer, Thomas Tamm, Hellis Tan, Tony Tang, Qiyi Tini, Simone Totzke, Patrick Tretmans, Jan Trotta, Davide Van der Merwe, Brink van Dijk, Tom Virtema, Jonni Vizel, Yakir Wang, Xinyu Winkler, Tobias Winter, Sarah Wojtczak, Dominik Zanasi, Fabio Zeilberger, Noam

### Contents




### **Constructing a universe for the setoid model**

Thorsten Altenkirch<sup>1∗</sup>, Simon Boulier<sup>2†</sup>, Ambrus Kaposi<sup>3‡</sup>, Christian Sattler<sup>4§</sup>, and Filippo Sestini<sup>1</sup>

<sup>1</sup> School of Computer Science, University of Nottingham, Nottingham, UK
{psztxa,psxfs5}@nottingham.ac.uk

<sup>2</sup> Inria, Nantes, France
simon.boulier@inria.fr

<sup>3</sup> Eötvös Loránd University, Budapest, Hungary
akaposi@inf.elte.hu

<sup>4</sup> Chalmers University of Technology, Gothenburg, Sweden
sattler@chalmers.se

**Abstract.** The setoid model is a model of intensional type theory that validates certain extensionality principles, like function extensionality and propositional extensionality, the latter being a limited form of univalence that equates logically equivalent propositions. The appeal of this model construction is that it can be constructed in a small, intensional, type theoretic metatheory, therefore giving a method to bootstrap extensionality. The setoid model has been recently adapted into a formal system, namely Setoid Type Theory (SeTT). SeTT is an extension of intensional Martin-Löf type theory with constructs that give full access to the extensionality principles that hold in the setoid model.

Although SeTT is already a rich theory as currently defined, it lacks a way to internalize the notion of type beyond propositions; hence we want to extend SeTT with a universe of setoids. To this aim, we present the construction of a (non-univalent) universe of setoids within the setoid model, first as an inductive-recursive definition, which is then translated to an inductive-inductive definition and finally to an inductive family. These translations from more powerful definition schemas to simpler ones ensure that our construction can still be defined in a relatively small metatheory which includes a proof-irrelevant identity type with a strong transport rule.

**Keywords:** type theory · function extensionality · univalence · setoid model · induction-recursion · induction-induction

<sup>∗</sup> Supported by USAF grant FA9550-16-1-0029.

<sup>†</sup> Supported by ERC Starting Grant CoqHoTT 637339.

<sup>‡</sup> Supported by the Bolyai Fellowship of the Hungarian Academy of Sciences (BO/00659/19/3) and by the "Application Domain Specific Highly Reliable IT Solutions" project that has been implemented with support from the National Research, Development and Innovation Fund of Hungary, financed under the Thematic Excellence Programme TKP2020-NKA-06 funding scheme.

<sup>§</sup> Supported by USAF grant FA9550-16-1-0029 and Swedish Research Council grant 2019-03765.

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 1–21, 2021.

https://doi.org/10.1007/978-3-030-71995-1_1

#### **1 Introduction**

Intuitionistic type theory is a formal system designed by Per Martin-Löf to be a full-fledged foundation in which to develop constructive mathematics [23,24]. A central aspect of type theory is the coexistence of two notions of equality. On the one hand, there is definitional equality, the computational equality that is built into the formalism. On the other hand, there is "propositional" equality, the internal notion of equality that is actually used to state and prove equational theorems within the system. The precise balance between these two notions is at the center of type theory research; however, it is generally understood that to properly support formalization of mathematics, one should aim for a notion of propositional equality that is as *extensional* as possible.

Two extensionality principles seem particularly desirable, since they arguably constitute the bare minimum for type theory to be comparable to set theory as a foundational system for set-level mathematics, in terms of power and ergonomics. One is function extensionality (or *funext*), according to which functions are equal if point-wise equal. Another is propositional extensionality (or *propext*), that equates all propositions that are logically equivalent.

Type theory with equality reflection, also known as *extensional type theory* (ETT), does support extensional reasoning to some degree, but unfortunately equality reflection makes the problem of type-checking ETT terms computationally unfeasible: it is undecidable.

On the other hand, *intensional type theory* (ITT) has nice computational properties like decidable type checking that can make it more suitable for computer implementation, but as usually defined (for example, in [23]) it severely lacks extensionality. It is known from model constructions that extensional principles like funext are consistent with ITT. Moreover, ITT extended with the principle of *uniqueness of identity proofs* (UIP) and funext is known to be as powerful as ETT [19]. We could recover the expressive power of ETT by adding these principles to ITT as axioms; however, this destroys computational properties such as canonicity.

What we would like instead is a formulation of ITT that supports extensionality, while retaining its convenient computational behaviour. Unfortunately, canonicity for Martin-Löf's inductively defined identity type says that if two terms are propositionally equal in the empty context, then they are also definitionally equal. This rules out function extensionality. The first step towards a solution is to give up the idea of propositional equality as a single inductive definition given generically for arbitrary types. Instead, equality should be *specific* to each type former in the type theory, or in other words, every type former should be introduced alongside an explanation of what counts as equality for its elements.

This idea of pairing types together with their own equality relation goes back to the notion of *setoid* or *Bishop set*. Setoids provide a quite natural and useful semantic domain in which to interpret type theory. The first setoid model was constructed to justify function extensionality without relying on funext in the metatheory [18]. Moreover, it was shown by Altenkirch [4] that if the model construction is carried out in a type theoretic metatheory with a universe of strict (definitionally proof-irrelevant) propositions, it is possible to define a univalent universe of propositions satisfying propositional extensionality. The setoid model thus satisfies all the extensionality principles that we would like to have in a set-level type theory<sup>5</sup>. The question is whether there exists a version of intensional type theory that supports setoid reasoning, and hence the forms of extensionality enabled by it.

This question was revisited and answered in Altenkirch et al. [5]. In this paper, the authors define Setoid Type Theory (SeTT), an extension of intensional Martin-Löf type theory with constructs for setoid reasoning, where funext and propext hold by definition. SeTT is based on the *strict* setoid model of Altenkirch<sup>6</sup>, which makes it possible to show consistency via a syntactic translation. This is in contrast with other type theories based on the setoid model, like Observational Type Theory [9] and XTT [28], which instead rely on ETT for their justification. A major property of SeTT is thus to illustrate how to bootstrap extensionality, by translation into a small intensional core.

SeTT as defined in [5] is already a rich theory, but its introspection capabilities are currently lacking, as its universes are limited to propositions. We would like to internalise the notion of type in SeTT, thus extending the theory with a universe of setoids. This goal brings up several questions, one of which has to do with the notion of equality with which the universe should come equipped: the universe of setoids is itself a setoid (as any type is) so it certainly cannot be univalent, since setoids lack the necessary structure. Another issue is the way such a universe can be justified by the setoid model, and in particular what principles are needed in the metatheory to do so.

*Contributions* This paper documents our work towards the construction of a universe of setoids inside the setoid model, and tries to answer these and other questions related to the design and implementation of this construction. Our main contribution is the construction of the universe in the model; this is given in steps, first as an inductive-recursive definition, which is then translated to an inductive-inductive definition, and subsequently to an inductive type. As a consequence, we show that we only need to assume indexed W-types and proof-irrelevant identity types in the metatheory (along with some obligatory basic tools like Σ and Π types) to construct the universe.

The universe constructions presented in this paper are, to our knowledge, the first examples of two kinds of data type reductions in an intensional metatheory: the first involving an inductive-recursive type which includes strict propositions, and the second involving an infinitary inductive-inductive type.

Finally, the mathematical contents of this paper have been formalized in the proof-assistant Agda (see [10]).

*Structure of the paper* We begin by describing the metatheory that we will use throughout the paper, in Section 2. In Section 3, after briefly recalling

<sup>5</sup> In the sense of HoTT; we mean a type theory limited to h-sets.

<sup>6</sup> A strict model is one where every equation holds definitionally.

*categories with families* as an abstract notion of models of type theory, we outline Altenkirch's setoid model as given in [5]. We then briefly discuss the rules of Setoid Type Theory in Section 3.2.

In Section 4 we discuss the setoid model and various design choices related to it. We then recall inductive-recursive universes, and the way they can be equivalently defined as a plain inductive definition, in Section 4.1. We then provide, in Section 4.2, a first complete definition of the setoid universe using a special form of induction-recursion. This form of induction-recursion is not known to be reducible to plain inductive types. Then we describe an alternative definition of the universe in Section 4.3, that does not rely on induction-recursion but instead on infinitary induction-induction. This inductive-inductive encoding of the universe is obtained from the inductive-recursive one, inspired by the method of Section 4.1. We end the series of universe constructions with Section 4.4, where we outline a purely inductive definition of the setoid universe, obtained from the inductive-inductive one.

#### **1.1 Related work**

The setoid model was first described in [18] in order to add extensionality principles to Type Theory such as function extensionality and propositional extensionality. A strict variant of the setoid model was given in [4] using a definitionally proof-irrelevant universe of propositions. Recently, support for such a universe was added to the proof-assistants Agda and Coq [17], allowing a full formalization of Altenkirch's setoid model. Setoid Type Theory (SeTT) is a recently developed formal system derived from this model construction [5]. Observational Type Theory (OTT) [9] is a syntax for the setoid model differing from SeTT in the use of a different notion of heterogeneous equality. Moreover, the consistency proof for OTT relies on Extensional Type Theory, whereas for SeTT it is obtained via a syntactic translation. XTT [28] is a cubical variant of OTT where the equality type is defined using an interval pretype<sup>7</sup>. XTT's universes support universe induction, whereas it is left open whether the construction presented here supports this principle. Palmgren and Wilander [27] construct a setoid universe using a translation into constructive set theory. Palmgren [26] constructs an encoding of ETT in ITT through Aczel's encoding of set theory in type theory [3]. He uses type theory as a language for his formalisation, but his construction is set-theoretic in nature. Setoids are utilized to encode sets as arbitrarily branching well-founded trees quotiented by bisimulation. His notion of family of setoids does not use strict propositions and it has a weaker form of proof irrelevance which does not seem to be enough to obtain a model of SeTT.

The principle of propositional extensionality in the setoid model is an instance of Voevodsky's univalence axiom [29]. The cubical set model is a constructive model justifying this axiom [11]. A type theory extracted from this model is Cubical Type Theory [13]. The relationship between the cubical set

<sup>7</sup> To quote one of the referees: the fact that the interval is a pretype is but the easiest part of the story.

model and cubical type theory is similar to that between the setoid model and SeTT. Compared to cubical type theories, SeTT has the advantage that the equality type satisfies more definitional equalities. For instance, whereas in cubical type theory equality of functions is isomorphic to pointwise equality, in SeTT the isomorphism is replaced by a definitional equality. SeTT is also a syntactically straightforward extension of Martin-Löf Type Theory, that does not require exotic objects like the interval pretype. In turn, the obvious advantage of cubical type theory is that it is not limited to setoids.

An exceptional aspect of the metatheory used in this paper is the presence of a proof-irrelevant identity type with a strong transport rule allowing elimination into arbitrary types. In [1], Abel gives a proof of normalization for the Logical Framework extended with a similar proof-irrelevant equality type. Abel and Coquand show in [2] that the combination of impredicativity with a strong transport rule results in terms that fail to normalize, but this is irrelevant in our setting.

### **2 MLTTProp**

This section describes MLTT**Prop**, our ambient metatheory. We employ Agda notation to write down MLTT**Prop** terms throughout the paper.

One of the main appeals of Altenkirch's setoid model is that it can justify several useful extensionality principles while being defined in a small intensional metatheory. We tried to stay true to this idea when figuring out the necessary metatheoretical tools for the universe construction in this paper. In particular, we wanted to avoid having to assume strong definition schemas that go beyond inductive families. MLTT**Prop** is thus an intensional type theory in the style of Martin-L¨of type theory.

We have sorts **Type**<sup>i</sup> of types and **Prop**<sup>i</sup> of strict propositions for i ∈ {0, 1}. Here, i = 0 means "small" (and we will omit the subscript) and i = 1 means "large". We have implicit lifting from i = 0 to i = 1, but do not assume type formers are preserved. **Type**<sup>1</sup> has universes for **Type** and **Prop**. We do not distinguish notationally between universes and sorts. We continue to describe only the case i = 0; everything introduced has an analogue at level i = 1. Propositions lift to types via Lift : **Prop** → **Type**, with constructor lift : {P : **Prop**} → P → Lift P and destructor unlift : {P : **Prop**} → Lift P → P.
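As a concrete illustration (not part of MLTT**Prop** itself), the lifting just described can be sketched in Agda, whose experimental Prop sort plays the role of our strict propositions; the names Lift, lift and unlift follow the text.

```agda
{-# OPTIONS --prop #-}

-- A sketch, assuming Agda's Prop sort for strict propositions:
-- Lift packages a strict proposition as a type with one (irrelevant) field.
record Lift (P : Prop) : Set where
  constructor lift
  field
    unlift : P

open Lift  -- unlift : {P : Prop} → Lift P → P
```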

We have standard type formers Π, Σ, Bool, **0**, **1** in **Type**. Σ-types are defined negatively by pairing –, – and projections π1, π2. We have definitional η-rules for Π-, Σ-, and **1**-types. We also require indexed W-types, both in **Type** and **Prop**: for □ ∈ {**Type**, **Prop**}, W□ : (S : I → **Type**) → ((i : I) → S i → I → **Type**) → I → □. The elimination principle of W**Prop** only allows defining functions into elements of **Prop**. From W**Prop** we can define propositional truncation ∥–∥ : **Type** → **Prop**, with constructor |–| : {A : **Type**} → A → ∥A∥ and eliminator elim∥–∥ : {P : **Prop**} → (A → P) → ∥A∥ → P.

In addition to type formers in **Type**, we will need the propositional versions of **0**, **1**, Π, and Σ. The latter three can be defined from their **Type** counterparts via truncation. That is, given P : **Prop** and Q : P → **Prop**:

$$\begin{aligned}
\mathbf{1}_{\mathbf{Prop}} &:\equiv \|\mathbf{1}\| \\
\Pi_{\mathbf{Prop}}\;P\;Q &:\equiv \|\Pi\;(\mathsf{Lift}\;P)\;(\mathsf{Lift}\circ Q\circ\mathsf{unlift})\| \\
\Sigma_{\mathbf{Prop}}\;P\;Q &:\equiv \|\Sigma\;(\mathsf{Lift}\;P)\;(\mathsf{Lift}\circ Q\circ\mathsf{unlift})\|
\end{aligned}$$

We assume that we have **0Prop** : **Prop** together with exfalso**Prop** : {A : **Type**} → **0Prop** → A.
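For illustration, the following Agda sketch declares propositional truncation directly as an inductive Prop (Agda allows this; in MLTT**Prop** it is instead derived from W**Prop**), together with its eliminator into strict propositions and the Prop-valued unit from the display above.

```agda
{-# OPTIONS --prop #-}

open import Agda.Builtin.Unit using (⊤)

-- Truncation as an inductive Prop: a sketch, not the W_Prop encoding.
data ∥_∥ (A : Set) : Prop where
  ∣_∣ : A → ∥ A ∥

-- Truncated types can only be eliminated into strict propositions.
elim∥∥ : {A : Set} {P : Prop} → (A → P) → ∥ A ∥ → P
elim∥∥ f ∣ a ∣ = f a

-- The Prop-valued unit from the display above.
⊤Prop : Prop
⊤Prop = ∥ ⊤ ∥
```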

Finally, we will assume an identity type in the style of Martin-Löf's inductive identity type. The main difference is that our identity type is a **Prop**-valued relation. We have a transport combinator transp from which J is derivable.

$$\begin{aligned}
\mathsf{Id} &: \{A : \mathsf{Type}\} \to A \to A \to \mathsf{Prop} \\
\mathsf{refl} &: \{A : \mathsf{Type}\}\,(a : A) \to \mathsf{Id}\;a\;a \\
\mathsf{transp} &: \{A : \mathsf{Type}\}\,(C : A \to \mathsf{Type})\,\{a_0\;a_1 : A\} \to \mathsf{Id}\;a_0\;a_1 \to C\;a_0 \to C\;a_1
\end{aligned}$$

with transp C {x} {x} e u ≡ u. The transp combinator provides a strong elimination principle, allowing a strict proposition (the identity type) to be eliminated into arbitrary types. We only use this identity type in Section 4.4. For the rest of our constructions, the traditional Martin-Löf identity type suffices.
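To make the claim that J is derivable concrete, here is a small Agda sketch: Id, refl and transp are postulated with the types above, and since Id a a is a strict proposition, the constant function λ _ → d is accepted at the required dependent type. The computation rule transp C {x} {x} e u ≡ u is not captured by a postulate and would additionally need the rewriting machinery mentioned in Section 2.1.

```agda
{-# OPTIONS --prop #-}

postulate
  Id     : {A : Set} → A → A → Prop
  refl   : {A : Set} (a : A) → Id a a
  transp : {A : Set} (C : A → Set) {a₀ a₁ : A} → Id a₀ a₁ → C a₀ → C a₁

-- J via transp: transport the dependent motive along e, then apply it to e.
-- λ _ → d checks against (p : Id a a) → C a p by definitional proof irrelevance.
J : {A : Set} {a : A} (C : (x : A) → Id a x → Set)
  → C a (refl a) → {x : A} (e : Id a x) → C x e
J {a = a} C d {x} e = transp (λ y → (p : Id a y) → C y p) e (λ _ → d) e
```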

#### **2.1 Formalization**

A universe of strict propositions has been recently added to the Agda proof assistant [17], making *most* of MLTT**Prop** a subset of Agda, with the exception of the proof-irrelevant identity type. Most of the universe constructions presented here have been formalized and proof-checked using Agda, with the proof-irrelevant identity type and the strong transport rule added via postulates and rewriting. The formalization can be found in [10].

For convenience, we slightly deviate from MLTT**Prop** both in the paper and in the formalization, for instance by relying on pattern matching instead of eliminators, and using primitive versions of **Prop**-valued Π and Σ types instead of deriving them from truncation. We operate under the assumption that everything can be equivalently carried out in MLTT**Prop**, although we have not fully checked all the necessary details.

#### **3 Setoid model**

By *setoid model* we mean a class of models of type theory where contexts/closed types are interpreted as setoids, i.e. sets with an equivalence relation, and dependent types are interpreted as dependent/indexed setoids. A setoid model was first given for intensional type theory by M. Hofmann [18], in order to provide a semantics for extensionality principles such as function and propositional extensionality.

Here we consider a similar model construction due to Altenkirch [4]. The peculiarity of this model is that it is presented in a type theoretic and intensional metatheory which includes a strict universe of propositions.

The setoid model thus defined validates function extensionality, a universe of propositions with propositional extensionality, and quotient types. Therefore, it provides a way to bootstrap and "explain" extensionality, since the model construction effectively gives an implementation of various extensionality principles in terms of a small, completely intensional theory.

#### **3.1 Setoid model as a CwF**

The setoid model can be framed categorically as a category with families (CwF, [14]) with extra structure for the various type and term formers. The core structure of a CwF can be given as the following signature:

> Con : **Type**
> Ty : (Γ : Con) → **Type**
> Sub : (Γ Δ : Con) → **Type**
> Tm : (Γ : Con) → Ty Γ → **Type**

In our presentation of the setoid model, contexts are given by setoids, that is, types together with an equivalence relation. A key point of the model is that the equivalence relation is valued in **Prop** and is thus definitionally proof irrelevant.

> Γ : Con
> |Γ| : **Type**
> Γ<sup>∼</sup> : |Γ| → |Γ| → **Prop**
> refl Γ : (γ : |Γ|) → Γ<sup>∼</sup> γ γ
> sym Γ : ∀{γ0 γ1} → Γ<sup>∼</sup> γ0 γ1 → Γ<sup>∼</sup> γ1 γ0
> trans Γ : ∀{γ0 γ1 γ2} → Γ<sup>∼</sup> γ0 γ1 → Γ<sup>∼</sup> γ1 γ2 → Γ<sup>∼</sup> γ0 γ2
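For readers following the Agda formalization, a context can be packaged as a record roughly as follows (a sketch; the field names Car, _∼_, reflᶜ, symᶜ and transᶜ are ours and simply mirror the components above).

```agda
{-# OPTIONS --prop #-}

-- Contexts as setoids: a carrier in Set with a Prop-valued equivalence.
record Con : Set₁ where
  field
    Car    : Set                               -- written |Γ| in the text
    _∼_    : Car → Car → Prop                  -- the relation Γ∼
    reflᶜ  : (γ : Car) → γ ∼ γ
    symᶜ   : {γ₀ γ₁ : Car} → γ₀ ∼ γ₁ → γ₁ ∼ γ₀
    transᶜ : {γ₀ γ₁ γ₂ : Car} → γ₀ ∼ γ₁ → γ₁ ∼ γ₂ → γ₀ ∼ γ₂
```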

Types in a context Γ are given by displayed setoids over Γ with a fibration condition given by coe, coh. In the following, we sometimes omit implicit quantifications such as the ∀{γ<sup>0</sup> γ1} in the type of sym Γ.

> A : Ty Γ
> |A| : |Γ| → **Type**
> A<sup>∼</sup> : Γ<sup>∼</sup> γ0 γ1 → |A| γ0 → |A| γ1 → **Prop**
> refl A : (x : |A| γ) → A<sup>∼</sup> (refl Γ γ) x x
> sym A : A<sup>∼</sup> p x0 x1 → A<sup>∼</sup> (sym Γ p) x1 x0
> trans A : A<sup>∼</sup> p x0 x1 → A<sup>∼</sup> q x1 x2 → A<sup>∼</sup> (trans Γ p q) x0 x2
> coe A : Γ<sup>∼</sup> γ0 γ1 → |A| γ0 → |A| γ1
> coh A : (p : Γ<sup>∼</sup> γ0 γ1)(x : |A| γ0) → A<sup>∼</sup> p x (coe A p x)

This definition of types in the setoid model is different from the one in [4], but it is equivalent to it [12, Section 1.6.1]. The main difference here is in the use of a heterogeneous equivalence relation A<sup>∼</sup> in the definition of types.

Substitutions are interpreted as functors between the corresponding setoids, whereas terms of type A in context Γ are sections of the type seen as a setoid fibration Γ.A → Γ. Note that we only need to include components for the functorial action on objects and morphisms, since the functor laws follow from proof-irrelevance in the metatheory, and thus hold definitionally.

$$\begin{array}{c}
\sigma : \mathsf{Sub}\;\Gamma\;\Delta \\ \hline
|\sigma| : |\Gamma| \to |\Delta| \\
\sigma^{\sim} : \Gamma^{\sim}\,\rho_0\,\rho_1 \to \Delta^{\sim}\,(|\sigma|\,\rho_0)\,(|\sigma|\,\rho_1)
\end{array}
\qquad
\begin{array}{c}
t : \mathsf{Tm}\;\Gamma\;A \\ \hline
|t| : (\gamma : |\Gamma|) \to |A|\,\gamma \\
t^{\sim} : (p : \Gamma^{\sim}\,\gamma_0\,\gamma_1) \to A^{\sim}\;p\;(|t|\,\gamma_0)\;(|t|\,\gamma_1)
\end{array}$$

We can show that the setoid model validates the usual basic type formers (Π, Σ, etc.), function extensionality and a universe of strict propositions with propositional extensionality [4]. Note that we do not need identity types or inductive types (W-types) for this.

#### **3.2 Setoid Type Theory**

The setoid model presented in the previous section is *strict*, that is, every equation of a CwF holds by definition in the semantics. One advantage of strict models is that they can be turned into *syntactic translations*, in which syntactic objects of the source theory are interpreted as their counterparts in another *target* theory. In the case of the setoid model, this gives rise to a *setoid translation*, where source contexts are interpreted as target contexts together with a target type representing the equivalence relation, and so on.<sup>8</sup>

A setoid translation is used in [5] to justify Setoid Type Theory (SeTT), an extension of Martin-Löf type theory (+ **Prop**) with equality types for contexts and dependent types that reflect the setoid equality of the model.

We recall the rules of SeTT that extend regular MLTT below, but with a variation: whereas the equality types in [5] are stated as elements of SeTT's internal universe of propositions, here we state the context equalities as elements of the external, metatheoretic universe **Prop**. This generalises the notion of model of SeTT thus making it easier to construct models. Equality on types is defined as before in [5].

We have a universe of propositions Prop defined as follows:

$$\frac{\Gamma : \mathsf{Con}}{\mathsf{Prop} : \mathsf{Ty}\;\Gamma}
\qquad
\frac{P : \mathsf{Tm}\;\Gamma\;\mathsf{Prop}}{\underline{P} : \mathsf{Ty}\;\Gamma}
\qquad
\frac{u : \mathsf{Tm}\;\Gamma\;\underline{P} \qquad v : \mathsf{Tm}\;\Gamma\;\underline{P}}{u \equiv v}$$

Equality type constructors for contexts and dependent types internalize the idea that every context and type comes equipped with a setoid equivalence relation. Note that **Prop** is the universe of the metatheory while Prop is the internal

<sup>8</sup> Semantically, this translation corresponds to a model construction, in particular a functor from the category of models of the target theory to the category of models of what will be Setoid Type Theory. Since the setoid translation is structural in the context component, we can work with models in the style of categories with families rather than contextual categories.

one. As in the model, equality for dependent types is indexed over context equality.

$$\frac{\Gamma : \mathsf{Con} \qquad \rho_0, \rho_1 : \mathsf{Sub}\;\Delta\;\Gamma}{\Gamma^{\sim}\;\rho_0\;\rho_1 : \mathbf{Prop}}
\qquad
\frac{A : \mathsf{Ty}\;\Gamma \qquad \rho_{01} : \Gamma^{\sim}\;\rho_0\;\rho_1 \qquad a_0 : \mathsf{Tm}\;\Delta\;A[\rho_0] \qquad a_1 : \mathsf{Tm}\;\Delta\;A[\rho_1]}{A^{\sim}\;\rho_{01}\;a_0\;a_1 : \mathsf{Tm}\;\Delta\;\mathsf{Prop}}$$

We have rules witnessing that these are indeed equivalence relations. We only recall reflexivity:

$$\frac{\rho : \mathsf{Sub}\;\Delta\;\Gamma}{\mathsf{R}\;\rho : \Gamma^{\sim}\;\rho\;\rho}
\qquad
\frac{A : \mathsf{Ty}\;\Gamma \qquad \rho : \mathsf{Sub}\;\Delta\;\Gamma \qquad a : \mathsf{Tm}\;\Delta\;A[\rho]}{\mathsf{R}\;a : \mathsf{Tm}\;\Delta\;(A^{\sim}\;(\mathsf{R}\;\rho)\;a\;a)}$$

In addition, we also have rules representing the fact that every construction in SeTT respects setoid equality, so that we can transport along any such equality:

$$\frac{A : \mathsf{Ty}\;\Gamma \qquad \rho_0, \rho_1 : \mathsf{Sub}\;\Delta\;\Gamma \qquad p : \Gamma^{\sim}\;\rho_0\;\rho_1 \qquad a : \mathsf{Tm}\;\Delta\;A[\rho_0]}{\mathsf{coe}_A\;p\;a : \mathsf{Tm}\;\Delta\;A[\rho_1] \qquad\quad \mathsf{coh}_A\;p\;a : \mathsf{Tm}\;\Delta\;(A^{\sim}\;p\;a\;(\mathsf{coe}_A\;p\;a))}$$

Notably, equality types in SeTT compute definitionally on concrete type formers. In particular, they compute to their obvious intended meaning, so that an equality of pairs is a pair of equalities, an equality of functions is a map of equalities, and so on. From this, we get definitional versions of function and propositional extensionality.

We can easily recover the usual Martin-Löf identity type from setoid equality, with transport implemented via coercion.

$$\frac{A : \mathsf{Ty}\;\Gamma \qquad a_0, a_1 : \mathsf{Tm}\;\Gamma\;A}{\mathsf{Id}_A\;a_0\;a_1 :\equiv A^{\sim}\;(\mathsf{R}\;\mathsf{id})\;a_0\;a_1 : \mathsf{Tm}\;\Gamma\;\mathsf{Prop}}$$
$$\frac{P : \mathsf{Ty}\;(\Gamma.A) \qquad p : \mathsf{Tm}\;\Gamma\;(\mathsf{Id}_A\;a_0\;a_1) \qquad t : \mathsf{Tm}\;\Gamma\;P[a_0]}{\mathsf{transp}\;P\;p\;t :\equiv \mathsf{coe}_P\;(\mathsf{R}\;\mathsf{id}, p)\;t : \mathsf{Tm}\;\Gamma\;P[a_1]}$$

We can also derive Martin-Löf's J eliminator for this homogeneous identity type. The only caveat is that transp and the J eliminator do not compute definitionally on reflexivity.

#### **4 Universe of setoids**

As pointed out in the introduction, SeTT is seriously limited by the lack of a universe internalizing the notion of setoid. Our goal is to extend SeTT with a universe of setoids; since SeTT is a direct syntactic reflection of the setoid model, this essentially amounts to showing that a universe of setoids with the necessary structure and equations can be constructed within the setoid model. This opens several questions and possible design choices.

A first fundamental consideration has to do with the very definition of the setoid universe: as any type in the setoid model, this universe must be a setoid and thus come equipped with an equivalence relation. However, unlike the universe of propositions, a universe of setoids cannot be univalent, since this would force it to be a groupoid. The obvious choice is therefore to have a non-univalent universe, and instead define the universe's relation so that it reflects a simple syntactic equality of codes rather than setoid equivalence.

Another question has to do with the metatheoretic tools required to carry out the construction of the universe. In fact, one of the main aspects of the setoid model construction recalled in Section 3 and shown originally in [4] is that it can be carried out in a very small type theoretic metatheory, thus providing a way to reduce extensionality to a small intensional core. We would like to stay faithful to this ideal when constructing this setoid universe.

A known and established method for defining universes in type theory relies on induction-recursion (IR), a definition schema developed by Dybjer [15,16]. Inductive-recursive definitions can be found throughout the literature, from the already mentioned type theoretic universes, including the original formulation à la Tarski by Martin-Löf [24], to metamathematical tools like computability predicates.

Although universe constructions in type theory—including our own setoid universe—are naturally presented as inductive-recursive definitions, they may not necessarily require a metatheory with induction-recursion. In fact, it is possible to reduce some instances of induction-recursion to plain induction (more specifically, inductive families), including some universe definitions. We recall this reduction in Section 4.1.

Other design choices on the setoid universe are less essential, but still require careful consideration. For instance, one question is whether the setoid universe should support universe induction, thus exposing the inductive structure of the codes. Such an elimination principle is known to be inconsistent with univalence, although this is not an issue in our case; nevertheless it is not immediately clear if the elimination principle can be justified by the semantics, that is, if our encoding of the setoid universe in the model allows such a universe eliminator to be defined. The question arises because our final encoding of the setoid universe only supports a weak form of elimination, for reasons that are explained in Section 4.4. Although not currently needed, a stronger eliminator might be necessary to justify universe induction. This problem should not arise in the other encodings of the setoid universe (as given in Section 4.2 and Section 4.3).

Another design choice has to do with how the setoid universe relates to the other universes. One could provide a code for Prop in the setoid universe. Moreover, the setoid universes could form a hierarchy, possibly cumulative.

Yet another choice is whether to have two separate sorts, one for propositions and one for sets (with propositions convertible to sets) or a single sort of types (sets), with propositions given by elements of a universe of propositions, which is a (large) type. We have chosen to present the second option to fit with the standard notion of (unisorted) CwF. However, this has downsides: to even talk about propositions, we need to have a notion of large types. The first option is more symmetric: we can have parallel hierarchies for propositions and sets.

#### **4.1 Inductive-recursive universes**

An inductive-recursive universe is given by a type of codes U : **Type**, and a family El : U → **Type** that assigns, to each code corresponding to some type, the meta-theoretic type of its elements. The resulting definition is inductive-recursive because the inductive type of codes is defined simultaneously with the recursive function El.

An example is the following definition of a small universe with bool and Π.

$$\begin{aligned}
&\mathsf{data}\;\mathsf{U} : \mathbf{Type}\\
&\quad \mathsf{bool} : \mathsf{U}\\
&\quad \mathsf{pi} : (A : \mathsf{U}) \to (\mathsf{El}\;A \to \mathsf{U}) \to \mathsf{U}\\[4pt]
&\mathsf{El} : \mathsf{U} \to \mathbf{Type}\\
&\mathsf{El}\;\mathsf{bool} :\equiv \mathsf{Bool}\\
&\mathsf{El}\;(\mathsf{pi}\;A\;B) :\equiv (a : \mathsf{El}\;A) \to \mathsf{El}\;(B\;a)
\end{aligned}$$
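Agda supports induction-recursion natively, so this small universe can also be written directly as a mutual definition; the following standalone sketch mirrors the display above, with Agda's builtin Bool playing the role of the booleans.

```agda
open import Agda.Builtin.Bool using (Bool)

-- The type of codes U and the decoding function El, defined simultaneously.
mutual
  data U : Set where
    bool : U
    pi   : (A : U) → (El A → U) → U

  El : U → Set
  El bool     = Bool
  El (pi A B) = (a : El A) → El (B a)
```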

Induction-recursion is arguably a nice and natural way to define internal universes in type theory; however, it is not always strictly required. We can translate basic instances of induction-recursion into inductive families using the equivalence of I-indexed families of types and types over I (that is, A : **Type** with A → I) [22].

In our case, we can encode U as an inductive type inU that *carves out* all types in **Type** that are in the image of El. In other words, inU is a predicate that holds for any type that would have been obtained via El in the inductive-recursive definition. As El is indexed by the type of codes, the definition of inU quite expectedly reflects the inductive structure of codes.

$$\begin{aligned}
&\mathsf{data}\;\mathsf{inU} : \mathbf{Type} \to \mathbf{Type}_1\\
&\quad \mathsf{inBool} : \mathsf{inU}\;\mathsf{Bool}\\
&\quad \mathsf{inPi} : \mathsf{inU}\;A \to ((a : A) \to \mathsf{inU}\;(B\;a)) \to \mathsf{inU}\;((a : A) \to B\;a)
\end{aligned}$$

U and El can be given by U :≡ Σ (A : **Type**) (inU A) and El :≡ π1.

Note that this construction gives rise to a universe in **Type**1, rather than **Type**, since the definition of U quantifies over all possible types in **Type**. Hence this kind of construction requires a metatheory with at least one universe.
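The reduction just described can likewise be rendered as standalone Agda (a sketch; the builtin Σ-type provides the pairing, and its first projection plays the role of π1).

```agda
open import Agda.Builtin.Bool using (Bool)
open import Agda.Builtin.Sigma

-- inU carves out the image of El; the universe is then a Σ-type in Set₁.
data inU : Set → Set₁ where
  inBool : inU Bool
  inPi   : {A : Set} {B : A → Set}
         → inU A → ((a : A) → inU (B a)) → inU ((a : A) → B a)

U : Set₁
U = Σ Set inU

El : U → Set
El = Σ.fst
```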

#### **4.2 Inductive-recursive setoid universe**

In this section we give a first definition of the setoid universe, as a direct generalization of the simple inductive-recursive definition just shown. For simplicity we only consider a very small universe with a bool type and Π; a more realistic universe that includes more type formers can be found in the Agda formalization.

To construct the universe of setoids in the setoid model, we first of all need to define a type U : Ty Γ for every Γ : Con, and for every A : Tm Γ U a type El A : Ty Γ. Recalling Section 3, these are essentially record types made of several components. Since U is a closed type, it requires the same data as a setoid; in particular, we need a type of codes together with an equivalence relation reflecting equality of codes, in addition to proofs that these are indeed equivalence relations:

data U : **Type**<sup>1</sup>
– ∼<sup>U</sup> – : U → U → **Prop**<sup>1</sup>
refl<sup>U</sup> : (A : U) → A ∼<sup>U</sup> A
sym<sup>U</sup> : A ∼<sup>U</sup> B → B ∼<sup>U</sup> A
trans<sup>U</sup> : A ∼<sup>U</sup> B → B ∼<sup>U</sup> C → A ∼<sup>U</sup> C

El is given by a family of setoids indexed over the universe, that is, a way to assign to each code in the universe a carrier set and an equivalence relation.

$$\mathsf{El}: \mathsf{U} \to \mathsf{Type}$$

$$-\vdash-\sim_{\mathsf{El}}- \;:\; \{a\;a' : \mathsf{U}\} \to a \sim_{\mathsf{U}} a' \to \mathsf{El}\;a \to \mathsf{El}\;a' \to \mathbf{Prop}$$

Note that – ⊢ – ∼El – is indexed over equality on the universe, because El is a displayed setoid over U; hence in particular it must respect the setoid equality of U. We also require data and proofs that make sure we get setoids out of El:

$$\begin{aligned}
&\mathsf{refl}_{\mathsf{El}} : (A : \mathsf{U})(x : \mathsf{El}\;A) \to \mathsf{refl}_{\mathsf{U}}\;A \vdash x \sim_{\mathsf{El}} x\\
&\mathsf{sym}_{\mathsf{El}} : p \vdash x \sim_{\mathsf{El}} x' \to \mathsf{sym}_{\mathsf{U}}\;p \vdash x' \sim_{\mathsf{El}} x\\
&\mathsf{trans}_{\mathsf{El}} : p \vdash x \sim_{\mathsf{El}} x' \to q \vdash x' \sim_{\mathsf{El}} x'' \to \mathsf{trans}_{\mathsf{U}}\;p\;q \vdash x \sim_{\mathsf{El}} x''\\
&\mathsf{coe}_{\mathsf{El}} : A \sim_{\mathsf{U}} B \to \mathsf{El}\;A \to \mathsf{El}\;B\\
&\mathsf{coh}_{\mathsf{El}} : (p : A \sim_{\mathsf{U}} A')\;(x : \mathsf{El}\;A) \to p \vdash x \sim_{\mathsf{El}} \mathsf{coe}_{\mathsf{El}}\;p\;x
\end{aligned}$$

We give an inductive definition of U, mutually with a recursive definition of the four functions – ∼<sup>U</sup> –, refl<sup>U</sup>, El and – ⊢ – ∼El –. The other functions are then defined recursively: reflEl on its own, sym<sup>U</sup> and symEl mutually, and trans<sup>U</sup>, transEl, coeEl and cohEl mutually. The whole construction is quite long; below we only show the more interesting definitions of U and El:

$$\begin{aligned}
&\mathsf{data}\;\mathsf{U} : \mathbf{Type}_1\\
&\quad \mathsf{bool} : \mathsf{U}\\
&\quad \mathsf{pi} : (A : \mathsf{U})(B : \mathsf{El}\;A \to \mathsf{U})\\
&\qquad\quad \to (\{x\;x' : \mathsf{El}\;A\} \to \mathsf{refl}_{\mathsf{U}}\;A \vdash x \sim_{\mathsf{El}} x' \to B\;x \sim_{\mathsf{U}} B\;x') \to \mathsf{U}\\[4pt]
&\mathsf{El}\;\mathsf{bool} :\equiv \mathsf{Bool}\\
&\mathsf{El}\;(\mathsf{pi}\;A\;B\;h) :\equiv \Sigma\;(f : (a : \mathsf{El}\;A) \to \mathsf{El}\;(B\;a))\\
&\qquad\quad (\forall\{x\;x'\}(p : \mathsf{refl}_{\mathsf{U}}\;A \vdash x \sim_{\mathsf{El}} x') \to h\;p \vdash f\;x \sim_{\mathsf{El}} f\;x')
\end{aligned}$$

Note that in the definition of U we require that the family B : El A → U be a setoid morphism, respecting the setoid equalities involved. This choice is crucial for the definition of El to go through, in particular since we eliminate the code for Π types into the setoid of functions that map equal elements to equal results. To state this mapping property we need to compare elements in different types, coming from applying f to different arguments x and x′. We know that x and x′ are equal, but to conclude B x ∼<sup>U</sup> B x′ we need to know that B respects setoid equality. This is exactly what we get from our definition of U.

We can now give a full definition of the setoid universe, and of El A for any A : Tm Γ U:


We can show that U is closed under Π types and booleans, and satisfies El (pi A B) ≡ Π (El A) (El B) and El bool ≡ Bool. The universe can be closed under more constructions if more codes are added to U. This gives a complete definition of a universe of setoids, which is, however, inductive-recursive. Moreover, the kind of recursion involved in this definition is particularly complex, and not obviously reducible to well-understood notions of induction-recursion like the one described in [16]. In any case, we would like to avoid extending the metatheory with any form of induction-recursion in order to keep the metatheory as small and essential as possible.

In the next section we transform our current inductive-recursive definition to one that does not use induction-recursion. The way this is done is inspired by the well-known trick to eliminate induction-recursion described in Section 4.1, but modified in a novel way to account for the presence of **Prop**-valued types. To our knowledge, this is the first time this reduction method is applied to an inductive-recursive type of this kind.

#### **4.3 Inductive-inductive setoid universe**

We will follow the method outlined in Section 4.1. In addition to inU for defining U, we also introduce a family inU∼ of binary relations between types in the universe, from which we then define – ∼<sup>U</sup> –.

$$\begin{aligned}
&\mathsf{data}\;\mathsf{inU} : \mathbf{Type} \to \mathbf{Type}_1\\
&\quad \mathsf{bool} : \mathsf{inU}\;\mathsf{Bool}\\
&\quad \pi : \mathsf{inU}{\sim}\;a\;a\;A{\sim} \to (\forall\{x_0\;x_1\}(x_{01} : A{\sim}\;x_0\;x_1) \to \mathsf{inU}{\sim}\;(b\;x_0)\;(b\;x_1)\;(B{\sim}\;x_{01}))\\
&\qquad \to \mathsf{inU}\;\big(\Sigma\;(f : (x : A) \to B\;x)\;((x_0\;x_1 : A)(x_{01} : A{\sim}\;x_0\;x_1) \to B{\sim}\;x_{01}\;(f\;x_0)\;(f\;x_1))\big)
\end{aligned}$$

data inU∼ : {A A′ : **Type**} → inU A → inU A′ → (A → A′ → **Prop**) → **Type**<sup>1</sup>

bool∼ : inU∼ bool bool (λ x0 x1. x0 ?= x1)
π∼ : {b0 : (x0 : A0) → inU (B0 x0)} {b1 : (x1 : A1) → inU (B1 x1)}
  {a0∼ : inU∼ a0 a0 A0∼} {a1∼ : inU∼ a1 a1 A1∼}
  {b0∼ : ∀{x0 x1}(x01 : A0∼ x0 x1) → inU∼ (b0 x0) (b0 x1) (B0∼ x01)}
  {b1∼ : ∀{x0 x1}(x01 : A1∼ x0 x1) → inU∼ (b1 x0) (b1 x1) (B1∼ x01)}
  → inU∼ a0 a1 A01∼
  → (∀{x0 x1}(x01 : A01∼ x0 x1) → inU∼ (b0 x0) (b1 x1) (B01∼ x01))
  → inU∼ (π a0 a0∼ b0 b0∼) (π a1 a1∼ b1 b1∼)
      (λ f0 f1. ∀{x0 x1}(x01 : A01∼ x0 x1) → B01∼ x01 (π1 f0 x0) (π1 f1 x1))

Just as the role of inU is, as before, to classify all types that are in the image of El, the role of inU∼ a a′ is to classify all relations of type A → A′ → **Prop** that are in the image of – ⊢ – ∼El –, given proofs a : inU A and a′ : inU A′. In particular, this definition of inU∼ states that the appropriate equivalence for boolean elements is the obvious syntactic equality – ?= –, whereas functions are to be compared pointwise. Note that inU appears in the sort of inU∼. Since these types are mutually defined, they form an instance of *induction-induction*, a schema that allows the definition of a type mutually with other types that contain the first one in their signature [25].<sup>9</sup>

As in the universe example in Section 4.1, we now define U as a Σ type, and El as the corresponding first projection.

$$\begin{aligned}
\mathsf{U} &: \mathbf{Type}_1 &\qquad \mathsf{El} &: \mathsf{U} \to \mathbf{Type}\\
\mathsf{U} &:\equiv \Sigma\;(X : \mathbf{Type})\;(\mathsf{inU}\;X) &\qquad \mathsf{El} &:\equiv \pi_1
\end{aligned}$$

What is left now is to define the setoid equality relation on the universe, as well as the setoid equality relation on El A for any A in U. Two codes A, B in the universe U are equal when there exists a setoid equivalence relation on their respective sets El A and El B. Intuitively, since elements of a setoid are only ever compared to elements of the same setoid, this should only be possible if A and B are codes for the same setoid, that is, if A ∼<sup>U</sup> B. Existence and well-formedness of such relations is expressed via the type inU∼ just defined, hence we would expect A ∼<sup>U</sup> B to be defined as follows:

$$(A, a) \sim_{\mathsf{U}} (B, b) :\equiv \Sigma\;(R : A \to B \to \mathbf{Prop})\;(\mathsf{inU}{\sim}\;a\;b\;R).$$

Unfortunately this definition only manages to capture the idea, but does not actually typecheck. In fact, – ∼<sup>U</sup> – should be a **Prop**1-valued relation, so A ∼<sup>U</sup> B should be a proposition. However, the Σ type shown above clearly is not, since it quantifies over a type of relations, which is not a proposition. One possible solution is actually quite simple, and it just involves truncating the Σ type above to force it to be in **Prop**1.

$$\begin{aligned} -\sim_{\mathcal{U}}- &: \mathcal{U} \to \mathcal{U} \to \mathbf{Prop}_1\\ (A, a) \sim_{\mathcal{U}} (B, b) &:\equiv\ \parallel \Sigma\ (R : A \to B \to \mathbf{Prop})\ (\mathsf{inU}{\sim}\ a\ b\ R) \parallel \end{aligned}$$

We are now left to define the indexed equivalence relation on El:

$$\begin{aligned} -\vdash-\sim_{\mathsf{El}}- &: \{A\ B : \mathcal{U}\} \to A \sim_{\mathcal{U}} B \to \mathsf{El}\, A \to \mathsf{El}\, B \to \mathbf{Prop} \\ p \vdash x \sim_{\mathsf{El}} y &:\equiv\ ? \end{aligned}$$

In the definition above, p has type ∥ Σ (R : El A → El B → **Prop**) (...) ∥. If this type were not propositionally truncated, we could define p ⊢ x ∼El y by extracting the relation from the first component of p and applying it to x and y, that is, p ⊢ x ∼El y :≡ π1 p x y. This would make the definitions of – ∼U – and – ⊢ – ∼El – in line with how we defined U and El.

However, this does not work in our case, since the type of p *is* propositionally truncated, hence it cannot be eliminated to construct a proof-relevant object. Fortunately, we can work around this limitation by defining p ⊢ x ∼El y by induction on the codes A B : U, in a way that ends up being logically equivalent to the proposition we would have obtained from π1 p x y if there were no truncation.

<sup>9</sup> The main example of induction-induction is the intrinsic definition of a dependent type theory in type theory [6].

More precisely, we need to construct proofs that, for any concrete R and inR, the types |(R, inR)| ⊢ x ∼El y and R x y are logically equivalent. These in turn need to be defined mutually with – ⊢ – ∼El –. We direct the interested reader to the Agda formalization for the full details of these definitions, as they are quite involved.

The full definition of the universe is concluded with the remaining definitions, like refl<sup>U</sup> ,reflEl, etc., which can be adapted from their IR counterparts more or less straightforwardly. The final result does not use induction-recursion, but it is nevertheless an instance of infinitary induction-induction. The ability to define arbitrary, infinitary inductive-inductive types clashes, again, with our objective of keeping the metatheory as small and simple as possible. The next step is therefore to reduce this inductive-inductive universe to one that does not require (infinitary) induction-induction.

#### **4.4 Inductive setoid universe**

This section encodes the inductive-inductive universe of setoids from the previous section without assuming arbitrary inductive-inductive definitions in the metatheory.

Before turning our attention to the setoid universe, we recall the known, systematic method to reduce finitary inductive-inductive types to inductive families.

**Reducing finitary induction-induction** It is known that finitary inductive-inductive definitions can be reduced to inductive families [8,7,21]. To illustrate the idea, let us consider a well-known example of a finitary inductive-inductive type, the intrinsic encoding of type theory in type theory itself. Actually, we only consider the type of contexts Con : **Type** and the type of types Ty : Con → **Type**; since the latter is indexed over the former, this is already an example of induction-induction.

Contexts in Con are formed out of empty contexts • and context extension –, –. Types in Ty are either the base type ι or Π types.

• : Con
–,– : (Γ : Con) → Ty Γ → Con
ι : (Γ : Con) → Ty Γ
Π : {Γ : Con}(A : Ty Γ) → Ty (Γ, A) → Ty Γ

The general method to eliminate induction-induction is to split the original inductive-inductive types into a type of codes and associated well-formedness predicates. In our Con/Ty example, these would be respectively given by codes Con0,Ty<sup>0</sup> : **Type** and predicates Con<sup>1</sup> : Con<sup>0</sup> → **Type**,Ty<sup>1</sup> : Con<sup>0</sup> → Ty<sup>0</sup> → **Type**.

The definition of the codes and predicate types follows that of the original inductive-inductive type, and can be derived systematically from it. More importantly, they can be defined without induction-induction, since although Con<sup>0</sup> and Ty<sup>0</sup> are defined mutually, their sorts are not indexed.

•0 : Con0
–,0– : Con0 → Ty0 → Con0
ι0 : Con0 → Ty0
Π0 : Con0 → Ty0 → Ty0 → Ty0

•1 : Con1 •0
–,1– : ∀{Γ0 A0} → Con1 Γ0 → Ty1 Γ0 A0 → Con1 (Γ0 ,0 A0)
ι1 : ∀{Γ0} → Con1 Γ0 → Ty1 Γ0 (ι0 Γ0)
Π1 : ∀{Γ0 A0 B0} → Con1 Γ0 → Ty1 Γ0 A0 → Ty1 (Γ0 ,0 A0) B0 → Ty1 Γ0 (Π0 Γ0 A0 B0)

We can recover the original inductive-inductive type as Con :≡ Σ (Γ<sup>0</sup> : Con0) (Con<sup>1</sup> Γ0) and Ty Γ :≡ Σ (A<sup>0</sup> : Ty0) (Ty<sup>1</sup> (π<sup>1</sup> Γ) A0). Recovering the constructors is straightforward:

$$\begin{aligned} \bullet &:\equiv (\bullet_0, \bullet_1) \\ (\varGamma_0, \varGamma_1), (A_0, A_1) &:\equiv ((\varGamma_0 \,,_0 A_0), (\varGamma_1 \,,_1 A_1)) \\ \iota\,(\varGamma_0, \varGamma_1) &:\equiv (\iota_0\,\varGamma_0, \iota_1\,\varGamma_1) \\ \varPi\,(A_0, A_1)\,(B_0, B_1) &:\equiv (\varPi_0\,\varGamma_0\,A_0\,B_0, \varPi_1\,\varGamma_1\,A_1\,B_1) \end{aligned}$$
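As a complement to the schematic description above, the following Haskell sketch (our own rendering with made-up names, not the paper's Agda development) shows the same split for the Con/Ty example: the untyped codes Con0 and Ty0 are plain mutually recursive data types, the well-formedness predicates Con1 and Ty1 are inductive families over them, and a context is recovered by pairing a code with its proof.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, ExistentialQuantification #-}

import Data.Kind (Type)

-- Untyped codes: two ordinary mutually recursive data types, no indexing.
data Con0 = Empty0 | Ext0 Con0 Ty0           -- codes for •  and  Γ , A
data Ty0  = Iota0 Con0 | Pi0 Con0 Ty0 Ty0    -- codes for ι Γ  and  Π {Γ} A B

-- Well-formedness predicates over the codes, as inductive families (GADTs).
data Con1 :: Con0 -> Type where
  Empty1 :: Con1 'Empty0
  Ext1   :: Con1 g -> Ty1 g a -> Con1 ('Ext0 g a)

data Ty1 :: Con0 -> Ty0 -> Type where
  Iota1 :: Con1 g -> Ty1 g ('Iota0 g)
  Pi1   :: Con1 g -> Ty1 g a -> Ty1 ('Ext0 g a) b -> Ty1 g ('Pi0 g a b)

-- A context is a code together with its well-formedness proof; this
-- existential package plays the role of the Σ type in the text.
data Con = forall g. MkCon (Con1 g)

emptyCon :: Con
emptyCon = MkCon Empty1
```

Recovering Ty Γ faithfully also requires the code of Γ at the type level (in Haskell this would need singletons), which is precisely what the Σ type and the first projection π1 provide in the type-theoretic setting.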

Finally, we can define eliminators/induction principles for Con and Ty as just defined, by induction on the well-typing predicates.

Following [25], we distinguish two versions of the eliminator: the *simple* one and the *general* one. Note that this is orthogonal to the distinction between non-dependent and dependent eliminators, of which we only consider the latter. The motives for the simple eliminator are C′ : Con → **Type** and T′ : (Γ : Con)(A : Ty Γ) → **Type**, and the eliminators themselves have the following signatures:

$$\mathsf{elim}'\_{\mathsf{Con}} : (\varGamma : \mathsf{Con}) \to C' \varGamma \qquad \qquad \mathsf{elim}'\_{\mathsf{Ty}} : \forall \{\varGamma\} (A : \mathsf{Ty} \varGamma) \to T' \varGamma \ A$$

In the case of the general eliminator, the motive for Ty depends on the motive for Con, making the two eliminators *recursive-recursive* functions. For motives C : Con → **Type** and T : (Γ : Con) → Ty Γ → C Γ → **Type** the signatures are:

$$\mathsf{elim}\_{\mathsf{Con}} \colon (\varGamma : \mathsf{Con}) \to C \varGamma \qquad \qquad \mathsf{elim}\_{\mathsf{T}\mathfrak{y}} \colon \forall \{\varGamma\} (A \colon \mathsf{T} \mathsf{y} \ \varGamma) \to T \ \varGamma \ A \ (\mathsf{elim}\_{\mathsf{Con}} \ \varGamma)$$

The general eliminators can be derived from our encoding of Con and Ty via untyped codes and well-typing predicates. The way to do it is to first define the graph of the eliminators in the form of inductively-generated relations:

> data R-Con : (Γ : Con) → C Γ → **Type**
> data R-Ty : {Γ : Con}(A : Ty Γ)(γ : C Γ) → T Γ A γ → **Type**

The next step is to prove that these relations are functional, by induction on the untyped codes Con<sup>0</sup> and Ty<sup>0</sup> [21]. From this result, defining the eliminators is immediate.

**Reducing the setoid universe** The reduction described in the previous section works generically for an arbitrary finitary inductive-inductive type, thus giving a systematic way to reduce finitary inductive-inductive definitions to inductive families. However, it is not clear whether this method extends to *infinitary* induction-induction, of which the setoid universe defined in Section 4.3 is an instance. Of course, the absence of a general reduction method does not mean that we cannot reduce particular concrete instances of infinitary induction-induction, which is exactly what we aim to do for our universe construction.

The obvious challenge in successfully completing this reduction is to avoid the need for extensionality in the metatheory. In fact, consider the simple infinitary inductive-inductive type obtained from the previous Con/Ty example by replacing the finitary constructor Π with an infinitary one: Π : {Γ : Con} → (ℕ → Ty Γ) → Ty Γ. Already with this simple example, we run into problems as soon as we try to define the eliminator. One issue is that the definition of the eliminator relies on a proof that the well-typing predicates inU1, inU∼1 are propositional, that is, any two of their elements are equal. Without further assumptions this proof can only be done by induction, and requires function extensionality since these predicates include higher-order constructors.

One way to get around this is to define the well-typing predicates as **Prop**-valued families, rather than in **Type**:

data inU0 : **Type** → **Type**1
data inU∼0 : {A A′ : **Type**} → (A → A′ → **Prop**) → **Type**1
data inU1 : (A : **Type**) → inU0 A → **Prop**1
data inU∼1 : {A A′ : **Type**} → (R : A → A′ → **Prop**) → inU∼0 R → **Prop**1

Using **Prop** avoids the issue of proving propositionality altogether, since the predicates are now propositional by definition. However, it introduces a different issue: inU1 and inU∼1 give rise to equational constraints on their indices, in the form of proofs of the **Prop**-valued identity type. The definition of the eliminators for inU and inU∼ relies on the ability to transport along these proofs, hence the need to extend our metatheory with a primitive, strong form of transport for Id.<sup>10</sup>

Having **Prop** and a strong transport principle does help to some extent. However, we would still need extensionality to derive the general eliminators for inU and inU∼. In fact, as explained in the previous section, to derive the general recursive-recursive eliminators we need to prove that the corresponding graph relations are functional, which cannot be done without funext.

Luckily, the *simple* elimination principle is sufficient for our purposes: all functions described in Section 4.3 can be defined just using the simple eliminator without recursion-recursion. The simple eliminator itself can be defined by pattern matching on the untyped codes, and does not require extensionality or any extra principles beyond strong transport.

Once the inductive encoding of the inductive-inductive universe is done, the setoid universe can be defined just as in Section 4.3.

<sup>10</sup> Note that this issue cannot be solved by expressing the equational constraints with an identity type in **Type**, since the well-typing predicates force it to necessarily be in **Prop**.

#### **5 Conclusions and further work**

We have described the construction of a universe of setoids in the setoid model of type theory; this is given in several steps, first as an inductive-recursive definition, then as an inductive-inductive definition, and finally as an inductive type. Each encoding is obtained from the previous one by adapting known data type transformation methods in a novel way that accounts for the peculiarities of our construction. In [5] we present rules for SeTT; clearly, these rules need to be extended with rules for a universe reflecting the semantics presented here.

It is known that finitary IITs can be reduced to inductive types in an extensional setting [21]. In our paper we reduce an infinitary IIT to inductive types in an intensional setting. In the future, we would like to investigate whether this reduction can be generalised to arbitrary infinitary IITs.

In contrast to the inductive-recursive and inductive-inductive versions of the universe, the inductive definition relies on a metatheory with a strong transport rule. As future work, we would like to prove normalization for this metatheory, since previous work in this respect [2] seems to suggest that it represents a non-trivial addition.

Another question regards the relationship between SeTT [5] and XTT [28]. Both systems are syntactic representations of the setoid model with similar design choices, like definitional proof-irrelevance. We would like to know whether their respective notions of models are equivalent, that is, if we can obtain an XTT model from a SeTT model, and vice versa. Since XTT universes support universe induction, for one direction we would need to extend our own universe with the same principle (see discussion in Section 3 and the previous paragraph). Thus a related question is whether our encodings of the setoid universe can support universe induction. A further question is whether this mapping of models is functorial.

Groupoids can be regarded as generalized setoids. In the future we would like to design a type theory internalizing the groupoid model of type theory [20], in the same way that SeTT represents a syntax for the setoid model. A further question is whether such "groupoid type theory" can be justified, similarly to SeTT, via a syntactic translation, perhaps with SeTT itself as the target theory.

#### **References**


Languages, pages 1–28, January 2019. URL: https://hal.inria.fr/hal-01859964, doi:10.1145/3290316.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **Nominal Equational Problems***-*

Mauricio Ayala-Rincón<sup>1</sup>, Maribel Fernández<sup>2</sup>, Daniele Nantes-Sobrinho<sup>1</sup>, and Deivid Vale<sup>3</sup>

<sup>1</sup> Departments of Computer Science and Mathematics, Universidade de Brasília, Brasília D.F., Brazil

{ayala, dnantes}@unb.br <sup>2</sup> Department of Informatics, King's College London, London, UK

maribel.fernandez@kcl.ac.uk

<sup>3</sup> Department of Software Science, Radboud University Nijmegen, Nijmegen, The Netherlands

deividvale@cs.ru.nl

**Abstract.** We define *nominal equational problems* of the form ∃*W*∀*Y* : *P*, where *P* consists of conjunctions and disjunctions of equations *s* ≈*α* *t*, freshness constraints *a*#*t* and their negations: *s* ≉*α* *t* and *a* #̸ *t*, where *a* is an atom and *s, t* nominal terms. We give a general definition of solution and a set of simplification rules to compute solutions in the nominal ground term algebra. For the latter, we define notions of solved form from which solutions can be easily extracted and show that the simplification rules are sound, preserving, and complete. With a particular strategy for rule application, the simplification process terminates and thus specifies an algorithm to solve nominal equational problems. These results generalise previous results obtained by Comon and Lescanne for first-order languages to languages with binding operators. In particular, we show that the problem of deciding the validity of a first-order equational formula in a language with binding operators (i.e., validity modulo *α*-equality) is decidable.

**Keywords:** Nominal syntax · Unification · Disunification.

### **1 Introduction**

*Nominal unification* [23] is the problem of solving equations modulo *α*-equivalence. A solution consists of a substitution and a freshness context ∇, i.e., a set of primitive constraints of the form *a*#*X* (read: "*a* is fresh for *X*"), which intuitively means that *a* cannot occur free in the instances of *X*. Nominal unification is decidable and unitary [23], and efficient algorithms exist [5,17], which can be used to solve problems of the form ∃*X̄* (⋀*i* *Δi* ⊢ *si* ≈*α* *ti*), where *si*, *ti* are nominal terms with variables in *X̄* and *Δi* is a freshness context.

First author partially funded by PrInt MAT-UnB-CAPES and CNPq grant numbers Ed 41/2017 and 07672/2017-4. Third author partially supported by DPI/UnB - 03/2020. Fourth author supported by NWO TOP project "Implicit Complexity through Higher Order Rewriting" (ICHOR), NWO 612.001.803/7571.

c The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 22–41, 2021. https://doi.org/10.1007/978-3-030-71995-1_2

Similarly, nominal disunification is the problem of solving disequations, i.e., negated equations of the form *s* ≉*α* *t*. An algorithm to solve *nominal constraint problems* of the form

$$\mathcal{P} := \exists \overline{X} \left( \left( \bigwedge \Delta\_i \vdash s\_i \approx\_\alpha t\_i \right) \wedge \left( \bigwedge \nabla\_j \vdash p\_j \not\approx\_\alpha q\_j \right) \right)$$

is available [1], which finds solutions in the nominal term algebra T(*Σ,* A*,* X) by constructing a suitable representation of the witnesses for the variables in P.

Comon and Lescanne [10] investigated a more general version of this problem, called *equational problem*; in their words: "an equational problem is any first-order formula whose only predicate symbol is =", that is, it has the form ∃*w*1*,...,wn*∀*y*1*,...,ym* : *P* where *P* is a *system*, i.e., an equation *s* = *t*, or a disequation *s* ≠ *t*, or a disjunction of systems ⋁ *Pi*, or a conjunction of systems ⋀ *Pi*, or failure ⊥, or success ⊤. The study of such problems was motivated by applications in pattern-matching for functional languages, sufficient completeness for term rewriting systems, negation in logic programming languages, etc.

In order to extend these applications to languages that offer support for binders and *α*-equivalence following the nominal approach, such as *α*Prolog [6], *α*Kanren [4], *α*LeanTAP [20], to nominal rewriting [14] and nominal (universal) algebra [15], in this paper we consider *nominal equational problems*.

Based on Comon and Lescanne's work, the nominal extension of a first-order equational problem is a formula P ::= ∃*W*<sup>1</sup> *...W<sup>n</sup>*∀*Y*<sup>1</sup> *...Y<sup>m</sup>* : *P* where *P* is a *nominal system*, i.e., a formula consisting of conjunctions and disjunctions of freshness, equality constraints, and their negations.

*Contributions.* This paper introduces nominal equational problems (NEPs) and presents simplification rules to find solutions in the ground nominal algebra. The simplification rules are shown to be terminating (by using a measure that strictly decreases with each rule application), and also sound and solution-preserving. The simplification process for NEPs is more challenging than in the syntactic case because it deals with two predicates (≈*<sup>α</sup>* and #) and needs to consider the interaction between freshness and *α*-equality constraints, and quantifiers. The elimination of universal quantifiers requires careful analysis since universal variables may occur in freshness constraints and in their negations. To make the process more manageable, we define a set of rules together with a strategy of application (specified by rule conditions) that simplifies the termination proof.

Finally, we show that the irreducible forms are either ⊥ or problems from which a solution can be easily extracted. In particular, if the NEP consists only of existentially quantified conjunctions of freshness and *α*-equality constraints, we obtain solved forms consisting of a substitution and a freshness context, as in the standard nominal unification algorithm [23].

*Related Work.* Comon and Lescanne [10] introduced first-order equational problems and studied their solutions in the algebra of rational trees, the initial term algebra, and the ground term algebra. A restricted version of equational problems, called disunification problems, which do not contain quantified variables, has been extensively studied in the first-order framework [8,3,11,2,22]. More recently, a nominal approach to disunification problems was proposed by Ayala et al. [1], including only conjunctions of equations, disequations and freshness constraints, without quantified variables. Here we generalise this previous work to deal with general formulas including disjunction, conjunction and negation of equations and freshness constraints, as well as existential and universal quantification over variables. To deal with negation of freshness, disjunctive formulas, and quantification we extend the semantic interpretation and design a different set of simplification rules as well as a more elaborate strategy for rule application.

Extensions of first-order equational problems modulo equational theories have also been considered. Although the problem of solving disequations modulo an equational theory is not even semi-decidable in general (as shown by Comon [7]), there are useful decidable and semi-decidable cases. For example, solvability of complement problems (a sub-class of equational problems) is decidable modulo theories with permutative operators (which include commutative theories) [9,13], and for linear complement problems solvability modulo associativity and commutativity is also decidable [16,19,12]. Buntine and Bürckert [3] solve systems of equations and disequations in equational theories with a finitary unification type. Fernández [11] shows that *E*-disunification is semi-decidable when the theory *E* is presented by a ground convergent rewrite system, and gives a sound and complete *E*-disunification procedure based on narrowing. Baader and Schulz [2] show that solvability of disunification problems in the free algebra of the combined theory *E*<sup>1</sup> ∪ *...* ∪ *E<sup>n</sup>* is decidable if solvability of disunification problems with linear constant restrictions in the free algebras of the theories *Ei*(1 ≤ *i* ≤ *n*) is decidable. Lugiez [18] introduces higher-order disunification problems and gives some decidable cases for which equational problems can be extended to higher-order systems.

*Organisation.* Section 2 recalls the main concepts of nominal syntax and semantics. Section 3 introduces nominal equational problems and a notion of solution for such problems. Section 4 presents a rule-based procedure for solving NEPs, as well as soundness, preservation of solutions, and termination results. Section 5 shows that the simplification rules reach solved forms from which solutions can be easily extracted. Section 6 concludes and discusses future work.

### **2 Background**

We assume the reader is familiar with nominal techniques and recall some concepts and notations that shall be used in the paper; for more details, see [14,21,23].

*Nominal Terms.* We fix countably infinite, pairwise disjoint sets of *atoms* A = {*a, b, c, . . .*} and *variables* X = {*X, Y, Z, . . .*}. Atoms follow the *permutative convention*: names *a, b* range permutatively over A, so they denote different objects. Let *Σ* be a finite set of term-formers disjoint from A and X, such that each *f* ∈ *Σ* is assigned a unique non-negative integer *n* (the arity of *f*, written *f* : *n*). We assume there is at least one *f* : *n* with *n >* 0.

A *permutation π* is a bijection A → A with finite domain, i.e., the set dom(*π*) := {*a* ∈ A | *π*(*a*) ≠ *a*} is finite. We shall represent permutations as lists of *swappings*: *π* = (*a*1 *b*1)(*a*2 *b*2) *...* (*an* *bn*). The identity permutation is denoted by id, and *π* ◦ *π*′ denotes the composition of *π* and *π*′. The set P of all such permutations together with composition forms a group (P*,* ◦), denoted simply by P. The *difference set* of *π* and *γ* is defined by ds(*π, γ*) = {*a* ∈ A | *π*(*a*) ≠ *γ*(*a*)}.

**Definition 1 (Nominal Terms).** *The set T*(*Σ,* A*,* X) *of Nominal Terms, or just terms for short, is inductively defined by the following grammar:*

$$s, t, u ::= a \mid \pi \cdot X \mid [a]t \mid f(t\_1, \ldots, t\_n),$$

*where a is an* atom*, π* · *X is a moderated variable,* [*a*]*t is the* abstraction *of a in the term t, and f*(*t*1*,...,tn*) *is a* function application *with f* ∈ *Σ and f* : *n. A term is ground if it does not contain variables.*

In an abstraction [*a*]*t*, *t* is the scope of the binder [·], which *binds* all free occurrences of *a* in *t*. An occurrence of an atom in a term is *free* if it is not under the scope of a binder. Notice that syntactic equality is not modulo *α*-equivalence; for example, [*a*]*a* ≢ [*b*]*b*. We may write *s* = *t* for *s* ≡ *t* with the same intended meaning, and *t̃* abbreviates an ordered sequence *t*1*,...,tn* of terms.

*Example 1.* Let *Σ<sup>λ</sup>* := {lam : 1*,* app : 2} be a signature for the *λ*-calculus. Using atoms to represent variables, *λ*-expressions are generated by the grammar:

$$e ::= a \mid \mathbf{1am}([a]e) \mid \mathbf{app}(e,e)$$

As usual, we sugar app(*s, t*) to *s t* and lam([*a*]*s*) to *λ*[*a*]*s*. The following are examples of nominal terms: (*λ*[*a*]*a*) *X* and (*λ*[*a*](*λ*[*b*]*b a*) *c*) *d*.

We inductively extend the action of a permutation *π* to a term *t*, denoted *π* · *t*, by setting: *π* · *a* = *π*(*a*), *π* · (*π*′ · *X*) = (*π* ◦ *π*′) · *X*, *π* · ([*a*]*t*) = [*π*(*a*)](*π* · *t*), and *π* · *f*(*t̃*) = *f*(*π* · *t̃*).

*Substitutions*, ranging over *σ, γ, τ . . .*, are maps (with finite domain) from variables to terms. The *action of a substitution σ* on a term *t*, denoted *tσ*, is inductively defined by: *aσ* = *a,*(*π* · *X*)*σ* = *π* · (*Xσ*), ([*a*]*t*)*σ* = [*a*](*tσ*) and *f*(*t*1*,...,tn*)*σ* = *f*(*t*1*σ, . . . , tnσ*). Notice that *t*(*σγ*)=(*tσ*)*γ*.
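The following Haskell fragment is a minimal executable reading of these definitions (the type and function names are ours): permutations as lists of swappings, their action on atoms and terms, the difference set, and the substitution action.

```haskell
import Data.Maybe (fromMaybe)

type Atom = String
type Var  = String

-- A permutation as a list of swappings (a1 b1)...(an bn); the rightmost
-- swapping acts first, so the list composes like ordinary functions.
type Perm = [(Atom, Atom)]

swapAtom :: (Atom, Atom) -> Atom -> Atom
swapAtom (a, b) c
  | c == a    = b
  | c == b    = a
  | otherwise = c

permAtom :: Perm -> Atom -> Atom
permAtom p c = foldr swapAtom c p

-- ds(pi, gamma), restricted to a finite set of atoms of interest.
ds :: [Atom] -> Perm -> Perm -> [Atom]
ds atoms p g = [ a | a <- atoms, permAtom p a /= permAtom g a ]

-- Nominal terms: atoms, moderated variables, abstraction, application.
data Term = At Atom | MVar Perm Var | Abs Atom Term | Fn String [Term]
  deriving (Eq, Show)          -- Eq is syntactic equality, not alpha-equivalence

-- Permutation action on terms: permutations suspend on variables.
permTerm :: Perm -> Term -> Term
permTerm p (At a)     = At (permAtom p a)
permTerm p (MVar q x) = MVar (p ++ q) x
permTerm p (Abs a t)  = Abs (permAtom p a) (permTerm p t)
permTerm p (Fn f ts)  = Fn f (map (permTerm p) ts)

-- Substitution action: (pi . X)sigma = pi . (X sigma).
type Subst = [(Var, Term)]

subst :: Subst -> Term -> Term
subst _   t@(At _)   = t
subst sig (MVar p x) = permTerm p (fromMaybe (MVar [] x) (lookup x sig))
subst sig (Abs a t)  = Abs a (subst sig t)
subst sig (Fn f ts)  = Fn f (map (subst sig) ts)
```

Restricting ds to an explicit list of atoms is a small deviation from the definition above, which ranges over all of A; for the finitely supported permutations used here, the atoms occurring in the two swapping lists are enough.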

**Definition 2 (Positions and subterms).** *Let s be a nominal term. The set* Pos(*s*) *of* positions *in s is a set of strings over positive integers defined inductively below. Additionally, s*|*<sup>p</sup> denotes the subterm of s at position p and s*(*p*) *denotes the symbol at position p.*

- If *s* = *a* or *s* = *π* · *X*, then Pos(*s*) = {*ε*} and *s*|*ε* = *s*;
- if *s* = [*a*]*t*, then Pos(*s*) = {*ε*} ∪ {1·*p* | *p* ∈ Pos(*t*)}, *s*|*ε* = *s* and *s*|1·*p* = *t*|*p*;
- if *s* = *f*(*s*1*, . . . , sn*), then Pos(*s*) = {*ε*} ∪ ⋃*i*=1*..n* {*i*·*p* | *p* ∈ Pos(*si*)}, *s*|*ε* = *s* and *s*|*i*·*p* = *si*|*p*.

*Freshness and α-equality.* A *nominal equation* is the symbol ⊤ or an expression *s* ≈*α* *t* where *s* and *t* are nominal terms. A *trivial equation* is either *s* ≈*α* *s* or ⊤. *Freshness constraints* have the form *a*#*t* where *a* is an atom and *t* a term. A *freshness context* is a finite set of *primitive* freshness constraints of the form *a*#*X*; we use *Δ,* ∇, and *Γ* to denote them. We extend the notation to sets of atoms: *A*#*X* denotes that *a*#*X* for every *a* ∈ *A*.

*α*-derivability is given by the deduction rules in Figure 1, which define an *equational theory* called CORE.


**Fig. 1.** CORE freshness and *α*-equality rules.


A judgement ∇ ⊢ *s* ≈*α* *t* states that, using the freshness constraints from ∇ as assumptions, *s* is *α*-equivalent to *t*; judgements ∇ ⊢ *a*#*t* are read analogously.
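On ground terms the rules of Figure 1 are directly executable. Below is a small Haskell sketch of the freshness and α-equality judgements (our own names; ground terms only, so no freshness contexts are needed).

```haskell
type Atom = String

-- Ground nominal terms: no variables, so # and alpha-equality are decidable checks.
data GTerm = A Atom | Ab Atom GTerm | Fn String [GTerm]
  deriving Show

swapT :: Atom -> Atom -> GTerm -> GTerm
swapT a b = go
  where
    sw c | c == a    = b
         | c == b    = a
         | otherwise = c
    go (A c)     = A (sw c)
    go (Ab c t)  = Ab (sw c) (go t)
    go (Fn f ts) = Fn f (map go ts)

-- a # t : the atom a does not occur free in t.
fresh :: Atom -> GTerm -> Bool
fresh a (A b)     = a /= b
fresh a (Ab b t)  = a == b || fresh a t
fresh a (Fn _ ts) = all (fresh a) ts

-- t ≈α u : abstractions with distinct binders are compared after a swapping,
-- together with a freshness side condition.
aeq :: GTerm -> GTerm -> Bool
aeq (A a)     (A b)     = a == b
aeq (Ab a t)  (Ab b u)
  | a == b    = aeq t u
  | otherwise = aeq (swapT b a t) u && fresh b t
aeq (Fn f ts) (Fn g us) = f == g && length ts == length us
                          && and (zipWith aeq ts us)
aeq _         _         = False
```

For example, aeq (Ab "a" (A "a")) (Ab "b" (A "b")) evaluates to True even though the two terms are syntactically distinct, matching the remark above about [a]a and [b]b.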

*Semantic Notions.* Nominal equational theory has a natural semantic denotation in *nominal sets* since we can easily interpret freshness and abstraction.

A P-set *X* is an ordinary set equipped with an action P × *X* → *X* (written *π* · *x*) such that id · *x* = *x* and *π* · (*π*′ · *x*) = (*π* ◦ *π*′) · *x*. A set of atoms *A* ⊂ A *supports* *x* ∈ *X* iff every permutation *π* ∈ P that fixes every element of *A* acts trivially on *x*, i.e., if *π*(*a*) = *a* for all *a* ∈ *A* then *π* · *x* = *x*. *Semantic freshness* is defined in terms of support: an atom *a* is fresh for *x* ∈ *X* iff *a* ∉ supp(*x*), which we denote by *a*#sem*x*. A *nominal set* is a P-set in which every element is finitely supported.

To build an algebraic ground term-model of CORE, we fix the set *G* consisting of the equivalence classes of provably *α*-equivalent ground terms. More precisely, given a ground term *g*, its class *ḡ* is the set of ground terms *g*′ for which there exists a derivation of ⊢ *g* ≈*α* *g*′. Note that *G* is a nominal set under the natural action *π* · *ḡ* = the class of *π* · *g*. Each function symbol *f* ∈ *Σ* is interpreted by an *equivariant function* *f*<sup>I</sup> mapping (*ḡ*1*,...,ḡn*) to the class of *f*(*g*1*,...,gn*), and abstraction is interpreted by an equivariant function [–]– : A × *G* → *G* such that *a*#sem[*ā*]*ḡ* always holds.

Signature interpretation is homomorphically extended to the set of terms as follows: fix a *valuation function ς* that assigns to every variable *X* ∈ X an element of *G*. The interpretation of a term *t* under *ς*, written [[*t*]]*ς*, is defined as:

$$\begin{aligned} [\![ a ]\!]_\varsigma &= \overline{a} \qquad [\![ \pi \cdot X ]\!]_\varsigma = \pi \cdot \varsigma(X) \qquad [\![ [a]t ]\!]_\varsigma = [\overline{a}]\,[\![ t ]\!]_\varsigma \\ [\![ f(t_1, \ldots, t_n) ]\!]_\varsigma &= f^{\mathcal{I}}([\![ t_1 ]\!]_\varsigma, \ldots, [\![ t_n ]\!]_\varsigma) \end{aligned}$$

**Definition 3 (Validity under** *<sup>ς</sup>***).** *Let* <sup>A</sup> *be any infinite subalgebra of* CORE *with domain <sup>A</sup> and <sup>ς</sup> a valuation function assigning for every variable <sup>X</sup>* <sup>∈</sup> <sup>X</sup> *an element of A. We say that:*


*Write* ∇ |= *s* ≈*α* *t* (*resp.* ∇ |= *a*#*t*) *when* [[∇ ⊢ *s* ≈*α* *t*]]*ς* (*resp.* [[∇ ⊢ *a*#*t*]]*ς*) *is valid for any valuation ς.*

A model of a nominal theory is an interpretation that validates all of its axiomatic judgements ∇ *s* ≈*<sup>α</sup> t*. It is easy to see that the interpretation we define above is a model of CORE. For the rest of the paper, we slightly abuse notation by calling CORE both the theory and its model making distinctions when necessary.

*Remark 1.* It is worth noticing the *syntactic* character of CORE: by interpreting atoms as themselves and since there are no equational axioms, we easily connect ∇ |= *a*#*t* and ∇ *a*#*t*. This behaviour is not the rule if equational axioms are considered. For instance, consider the theory LAM that axiomatises *β*-equality in the *λ*-calculus. It is a fact that *a*#sem(*λ*[*a*]*b*)*a* in LAM but there is no syntactic derivation for *a*#(*λ*[*a*]*b*)*a*. Furthermore, by completeness for equality derivation, we establish a connection between ∇ |= *s* ≈*<sup>α</sup> t* and ∇ *s* ≈*<sup>α</sup> t*.

There are alternative definitions of nominal terms where the syntax is manysorted. We chose to work with an unsorted syntax for simplicity; all the results below can be extended to the many-sorted case, indeed they are proved for any infinite subalgebra of the ground nominal algebra.

#### **3 Nominal Equational Problems**

In this section, we introduce *nominal equational problems* (NEPs) as our main object of study. A NEP is a first-order formula built only with the predicates ≈*α*

and #. Their negations, denoted ≉*α* and #̸, are used to build disequations and non-freshness constraints. A *trivial disequation* is either *s* ≉*α* *s* or ⊥.

Intuitively, a non-freshness constraint *a* #̸ *t* (read "*a* is not fresh for *t*") states that there exists at least one instance of *t* in which *a* occurs free. Similarly for disequations: *s* ≉*α* *t* states that *s* and *t* are not *α*-equivalent.

**Definition 4.** *A* nominal system *is a formula defined by the following grammar:*

$$P, P' ::= \top \mid \bot \mid s \approx_{\alpha} t \mid s \not\approx_{\alpha} t \mid a \,\#\, t \mid a \,\not\#\, t \mid P \land P' \mid P \lor P'$$

In the next definition, we make a distinction between the set of variables occurring in a NEP: the mutually disjoint sets *W* = {*W*1*,...,W<sup>n</sup>*} and *Y* = {*Y*1*,...,Y<sup>m</sup>*} denote existentially and universally quantified variables, respectively. The former we call *auxiliary variables* and the latter *parameters.*

**Definition 5 (**NEP). *A* NEP *is a formula of the form below, where P is a nominal system.*

$$\mathcal{P} ::= \exists W\_1 \dots W\_n \forall Y\_1 \dots Y\_m : P$$

The set Fv(P) contains the free variables occurring in P. For the rest of the paper, we use the following implicit naming scheme for variables: *W* denotes an auxiliary variable, *Y* a parameter, *X* a free variable, and *Z* an arbitrary variable.
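A direct transcription of Definitions 4 and 5 into Haskell (our own names; suspended permutations on variables are omitted for brevity) may help fix the syntax:

```haskell
type Atom = String
type Var  = String

-- Nominal terms as in Section 2, with moderated variables simplified to
-- bare variables for this illustration.
data Term = At Atom | V Var | Ab Atom Term | Fn String [Term]
  deriving Show

-- Atomic constraints: equations, disequations, freshness and non-freshness.
data Constraint
  = Eq Term Term | Neq Term Term | Fresh Atom Term | NotFresh Atom Term
  deriving Show

-- Nominal systems (Definition 4).
data System
  = STrue | SFalse | C Constraint | And System System | Or System System
  deriving Show

-- A nominal equational problem, exists W1..Wn forall Y1..Ym : P (Definition 5).
data NEP = NEP { auxiliaries :: [Var], parameters :: [Var], body :: System }
  deriving Show

-- The problem of Example 4 below: exists W, forall Y : W not-alpha-equal suc(Y).
example4 :: NEP
example4 = NEP ["W"] ["Y"] (C (Neq (V "W") (Fn "suc" [V "Y"])))
```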

*Example 2.* Nominal disunification constraints [1] are pairs of the form P := ∃*W̄* (*E* || *D*), where *E* is a finite set of nominal equations-in-context, i.e., *E* = ⋃*i*=0*..n* {*Δi* ⊢ *si* ≈*α* *ti*}, and *D* is a finite set of nominal disequations-in-context, *D* = ⋃*j*=0*..m* {∇*j* ⊢ *uj* ≉*α* *vj*}. This problem is a particular NEP: reading the judgement *Δ* ⊢ *s* ≈*α* *t* as *Δ* ⇒ *s* ≈*α* *t*, or equivalently as ¬[*Δ*] ∨ *s* ≈*α* *t*,<sup>4</sup> we obtain the formula:

$$\mathcal{P} := (\bigwedge\_{i=0}^{n} (\neg[\Delta\_i] \lor s\_i \approx\_{\alpha} t\_i)) \land (\bigwedge\_{j=0}^{m} (\neg[\nabla\_j] \lor u\_j \not\approx\_{\alpha} v\_j)),$$

where [*Δi*]*,* [∇*<sup>j</sup>* ] are conjunctions of freshness constraints in *Δi*, ∇*<sup>j</sup>* , respectively.

*Sufficient completeness*, that is, deciding whether a set of patterns (rules) covers all possible cases, is a well-known problem in functional programming. In the next example, we show how to naturally represent such problems as NEPs.

*Example 3.* Consider the function map which applies a function [*a*]*F* to every element of any list *L*. It may be defined by the rules below:

$$\mathcal{R}\_{\mathsf{map}} = \left\{ \begin{array}{c} \vdash \mathsf{map}([a]F, \mathsf{nil}) \rightarrow \mathsf{nil} \\ \vdash \mathsf{map}([a]F, \mathsf{cons}(X, L)) \rightarrow \mathsf{cons}(F\{a \mapsto X\}, \mathsf{map}([a]F, L)), \end{array} \right\}$$

<sup>4</sup> Similarly, for disequations.

where –{*a* ↦ –} is a binary term-former representing (explicit) substitution; see [14, Example 43] for more details. Since we are not imposing a type discipline on nominal terms it is possible to construct ill-typed terms, for instance map(*a,* [*a*]*t*). In what follows we ignore those expressions, noting that a type discipline would not allow such constructions. Sufficient completeness can then be checked using the following NEP:

> ∀*Y*1 *Y*2 *Y*3 *L*′ : map([*a*]*F, L*) ≉*α* map([*b*]*Y*1*,* nil) ∧ map([*a*]*F, L*) ≉*α* map([*b*]*Y*2*,* cons(*Y*3*, L*′))

If the problem has a solution then Rmap is not complete, and the solution indicates the missing pattern cases in the definition.

*Solutions of Nominal Equational Problems.* We are interested in solutions for NEPs in the ground nominal algebra. From now on, A denotes an infinite subalgebra of CORE with domain *A*. Below we define solutions using idempotent substitutions, which can be seen as a representation for valuations that map variables to elements of the ground term algebra.

We first extend the interpretation function under a valuation *<sup>ς</sup>* -·*<sup>ς</sup>* (see Section 2) to the negated form of freshness and *α*-equality constraints.

**Definition 6.** *Let ς be a (fixed but arbitrarily given) valuation. A negative constraint a* #̸ *t (resp. s* ≉*α* *t) is valid under ς when:*


In standard unification algorithms, idempotent substitutions are used as a compact representation of a set of valuations in the ground term algebra. Similarly, given a valuation in the ground term algebra, one can build a ground substitution representing it. In the case of the ground nominal algebra, where elements are *α*-equivalence classes of terms, the representative is generally not unique, but any representative can be used.

**Definition 7.** *Given a substitution σ* = [*X*1*/t*1*,...,Xn/tn*]*, for any valuation ς, we denote by ςσ the valuation such that ςσ*(*X*) = *ς*(*X*) *if X* ∉ dom(*σ*)*, and ςσ*(*X*) = [[*Xσ*]]*ς* *otherwise.*

*Given a valuation ς* = [*Xi* ↦ *gi* | *Xi* ∈ X*, gi* ∈ *A*] *and a finite set* X *of variables, we denote by σ*<sup>ς</sup>X *any ground substitution such that, for each Xi* ∈ X*, σ*(*Xi*) = *ti where gi* = [[*ti*]]*ς*. *We say that σ*<sup>ς</sup>X *is a grounding substitution for* X*.*

The next lemma states that under mild conditions we can extend substitutions to valuations preserving semantic equality.

**Lemma 1.** *Given an idempotent substitution σ* = [*X*1*/t*1*,...,Xn/tn*] *and a valuation ς, we have* [[*sσ*]]*ς* = [[*s*]]*ςσ*.

The next definition allows us to use idempotent substitutions to represent solutions of constraints.

**Definition 8 (Constraint** <sup>A</sup>**-validation).** *Let <sup>σ</sup> be an idempotent substitution whose domain includes all the variables occurring in a constraint C. Then σ* <sup>A</sup>*-validates <sup>C</sup> iff* -*<sup>C</sup><sup>ς</sup><sup>σ</sup> is valid in* <sup>A</sup> *for any valuation <sup>ς</sup>.*

We now extend semantic validity to the syntax of systems. The interpretation for the logical connectives is defined as expected.

**Definition 9 (**A**-validation).** *For an idempotent substitution <sup>σ</sup> whose domain includes all variables occurring in a system P, we say that σ* A*-*validates *P iff*


Solutions of equational problems instantiate free variables and satisfy existential and universal requirements for auxiliary variables and parameters, respectively. To define this notion, we extend the domain of the substitution to include also existential and universally quantified variables as follows.

**Definition 10 (**A**-Solution).** *Let* <sup>P</sup> <sup>=</sup> <sup>∃</sup>*W*∀*<sup>Y</sup>* : *<sup>P</sup> be a NEP. Let <sup>σ</sup> be an idempotent substitution such that* dom(*σ*) = *Fv*(P)*. Then σ is an* A*-*solution *of* P *iff there is a ground substitution δ, where* dom(*δ*) = *W, such that for all ground substitution λ, where* dom(*λ*) = *Y , σδλ* A*-validates P. The set of* A*-solutions of* P *is denoted* SA(P)*, or simply* S(P) *if* A *is clear from the context.*

*Example 4.* Consider the signature *Σ*nat := {zero : 0*,* suc : 1} for natural numbers, and the nominal initial algebra Anat with zero and suc interpreted as expected. The problem P := ∃*W*∀*Y* : *W* ≉*α* suc(*Y*) has id as a solution. Indeed, taking for example *δ* = [*W/*zero] or *δ* = [*W/a*] and any choice of *λ* (with dom(*λ*) = {*Y*}), the composition id*δλ* A-validates *W* ≉*α* suc(*Y*).

In Definition 10, *δ* is the substitution that instantiates auxiliary variables, so there can be many (possibly infinitely many) such *δ*'s.

**Lemma 2 (Equivariance of Solutions).** *If <sup>σ</sup> is an* <sup>A</sup>*-*solution *of the* NEP P *then for any permutation π, π* · *σ (defined by* [*Xi/π* · *ti*]*, as expected) is an* A*-*solution *of π*·P*. In particular, if an* A*-*solution *contains an atom not occurring in* P*, that atom can be swapped for any other atom not occurring in* P*.*

Lemma 2 is a direct consequence of the fact that interpretations are equivariant, and shows that solutions are closed by permutation. It allows us to use permutations to represent infinite choices for atoms in solutions.

*Example 5.* Consider the problem ∀*Y* : *X* ≉*α* *λ*[*a*]*Y*, built over the signature of Example 1. The set of solutions contains *σ* = [*X/a*] as well as (*a b*)·[*X/a*] = [*X/b*], for any other atom *b*.

**Lemma 3 (Closure by Instantiation).** *If σ is an* A*-*solution *of the* NEP P = ∃*W̄*∀*Ȳ* : *P, then any idempotent substitution σ*′ *obtained as an instance of σ such that* dom(*σ*′) = dom(*σ*) *is also an* A*-solution of* P*. In particular, for any such ground instance σ*′ *of σ there is a ground substitution δ, where* dom(*δ*) = *W̄, such that for every ground substitution λ, where* dom(*λ*) = *Ȳ, σ*′*δλ* A*-validates P.*

*Proof.* By definition of A-solution, to show that *σ* is an A-solution of P we need to consider all the valuations of the form *ς<sup>σ</sup>*- *δλ* as indicated in Definitions 8, 9, 10. The result follows from the fact that for any valuation *ς<sup>σ</sup>*- *δλ* there exists an equivalent valuation *ςσδλ* by Lemma 1.

#### **4 A rule-based procedure**

In this section we present a set of simplification rules to solve NEPs. A simplification step, denoted P =⇒ P′, transforms P into an equivalent problem P′ from which solutions are easier to extract.

#### **4.1 Simplification Rules**

Rules may have application conditions (rule controls) that define a strategy of simplification. Our strategy gives priority to rules according to their role. We split the rules into groups R*<sup>i</sup>* as shown in Figures 2, 3 and 4: R<sup>1</sup> eliminates trivial constraints, R<sup>2</sup> deals with clash and occurs check, R<sup>3</sup> eliminates unneeded quantifiers, R<sup>4</sup> and R<sup>5</sup> decompose positive and negative constraints, respectively, R<sup>6</sup> eliminates parameters and R<sup>7</sup> instantiates variables. The Explosion and Elimination of Disjunction rules in R<sup>8</sup> search for solutions as explained below. Finally, R<sup>9</sup> eliminates the remaining universal quantifiers. A rule *R* ∈ R*<sup>i</sup>* can only be applied if no rules from R*<sup>j</sup>* , where *j<i*, can be applied.

Since we are dealing with formulas that contain disjunction and conjunction connectives, we need to take into account the standard Boolean axioms. To simplify, instead of working modulo the Boolean axioms we apply a Boolean normalisation step before a rule is applied. Following Comon and Lescanne [10], we choose to take *conjunctive normal form*: Before the application of each rule P is reduced to a conjunction of disjunctions.
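The normalisation step is the usual distribution of disjunction over conjunction. A generic Haskell sketch over an abstract type of atomic constraints (our own names; ⊤ and ⊥ can be treated as leaves) is:

```haskell
-- Systems with an abstract type c of atomic constraints.
data System c = Leaf c | And (System c) (System c) | Or (System c) (System c)
  deriving Show

-- Conjunctive normal form: push disjunctions below conjunctions.
cnf :: System c -> System c
cnf (And p q) = And (cnf p) (cnf q)
cnf (Or  p q) = distr (cnf p) (cnf q)
cnf leaf      = leaf

distr :: System c -> System c -> System c
distr (And p1 p2) q = And (distr p1 q) (distr p2 q)
distr p (And q1 q2) = And (distr p q1) (distr p q2)
distr p q           = Or p q
```

Applied to (A ∧ B) ∨ C, for instance, cnf yields (A ∨ C) ∧ (B ∨ C).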

The explosion rule creates new branches by instantiating a variable in all possible ways of constructing terms (i.e., with each *f* ∈ *Σ*, with abstractions, and with atoms). Note that *Σ* ∪ Atoms(*P*) ∪ {*a*′} is a finite set (we can represent all possible constructions with a finite number of cases), so the rule is finitely branching, as illustrated by the sketch below.
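To make the finite branching concrete, the following Haskell sketch (hypothetical helper names) enumerates the candidate right-hand sides *t* of the new equation *X* ≈*α* *t* added by the explosion rule: one per function symbol in Σ, plus one abstraction and one atom for each atom of Atoms(P) together with the extra fresh atom a′.

```haskell
type Atom = String

data Term = At Atom | V String | Ab Atom Term | Fn String [Term]
  deriving Show

-- Candidate instantiations for the Explosion rule; W, W1, ..., Wn stand for
-- fresh auxiliary (existentially quantified) variables.
candidates :: [(String, Int)]  -- signature Sigma as (symbol, arity) pairs
           -> [Atom]           -- Atoms(P)
           -> Atom             -- a fresh atom a' not occurring in P
           -> [Term]
candidates sigma atomsP a' =
     [ Fn f [ V ("W" ++ show i) | i <- [1 .. n] ] | (f, n) <- sigma ]
  ++ [ Ab a (V "W") | a <- allAtoms ]
  ++ [ At a | a <- allAtoms ]
  where allAtoms = atomsP ++ [a']
```

With the signature of Example 1, Atoms(P) = {a} and fresh atom b, this yields six candidates, matching the six problems produced by the explosion step in Example 7 below.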

The rule Elimination of Disjunctions also builds a finite number of branches. Therefore, our procedure builds a finitely branching tree of problems to be solved.

Rules R1-R<sup>8</sup> are not sufficient to eliminate all parameters from a NEP (see Example 6) in contrast with the syntactic case [7], where similar rules produce parameterless normal forms. This is because we are dealing with both freshness and *α*-equality. Indeed, normal forms for rules R1-R<sup>8</sup> may contain parameters, but only in disjunctions involving both freshness and equality constraints for the same parameter as the following lemma states. The rules in R<sup>9</sup> (Figure 4) are introduced to deal with this problem.

R1: Trivial Rules

(*T*1) *t* ≈*α* *t* =⇒ ⊤
(*T*2) *t* ≉*α* *t* =⇒ ⊥
(*T*3) *a* ≈*α* *b* =⇒ ⊥
(*T*4) *a*#*b* =⇒ ⊤
(*T*5) *a*#*a* =⇒ ⊥
(*T*6) *a* #̸ *a* =⇒ ⊤
(*T*7) *a* #̸ *b* =⇒ ⊥
(*T*8) *a*#*t* ∧ *a* #̸ *t* =⇒ ⊥
(*T*9) *a*#*t* ∨ *a* #̸ *t* =⇒ ⊤

R2: Clash and Occurrence Check Rules

(*CL*1) *s* ≉*α* *t* =⇒ ⊤
(*CL*2) *s* ≈*α* *t* =⇒ ⊥
**Conditions for** (*CL*1) **and** (*CL*2)**:** *s*(*ε*) ≠ *t*(*ε*) and neither side is a moderated variable.
(*O*1) *π* · *Z* ≈*α* *t* =⇒ ⊥
(*O*2) *π* · *Z* ≉*α* *t* =⇒ ⊤
**Conditions for** (*O*1) **and** (*O*2)**:** *Z* ∈ vars(*t*) and *t* ≢ *π*′ · *Z*.

R3: Elimination of parameters and auxiliary unknowns

(*C*1) ∀*Y,Ȳ* : *P* =⇒ ∀*Ȳ* : *P*, if *Y* ∉ vars(*P*)
(*C*2) ∃*W,W̄* : *P* =⇒ ∃*W̄* : *P*, if *W* ∉ vars(*P*)
(*C*3) ∃*W,W̄* : *π* · *W* ≈*α* *t* ∧ *P* =⇒ ∃*W̄* : *P*, if *W* ∉ vars(*P, t*)

R4: Equality and freshness simplification

(*E*1) *π* · *X* ≈*α* *γ* · *X* =⇒ ds(*π, γ*)#*X*
(*E*2) [*a*]*t* ≈*α* [*a*]*u* =⇒ *t* ≈*α* *u*
(*E*3) [*a*]*t* ≈*α* [*b*]*u* =⇒ (*b a*) · *t* ≈*α* *u* ∧ *b*#*t*
(*E*4) *f*(*t̃*) ≈*α* *f*(*ũ*) =⇒ ⋀*i* *ti* ≈*α* *ui*
(*F*1) *a*#*π* · *X* =⇒ *π*<sup>−1</sup>(*a*)#*X*, if *π* ≠ id
(*F*2) *a*#[*a*]*t* =⇒ ⊤
(*F*3) *a*#[*b*]*t* =⇒ *a*#*t*
(*F*4) *a*#*f*(*t*1*,...,tn*) =⇒ ⋀*i* *a*#*ti*

R5: Disunification

(*DC*) *f*(*t̃*) ≉*α* *f*(*ũ*) =⇒ ⋁*i* *ti* ≉*α* *ui*
(*D*1) *π* · *X* ≉*α* *γ* · *X* =⇒ ⋁*a*∈ds(*π,γ*) *a* #̸ *X*
(*D*2) [*a*]*t* ≉*α* [*a*]*u* =⇒ *t* ≉*α* *u*
(*D*3) [*a*]*t* ≉*α* [*b*]*u* =⇒ (*b a*) · *t* ≉*α* *u* ∨ *b* #̸ *t*
(*NF*1) *a* #̸ *π* · *X* =⇒ *π*<sup>−1</sup>(*a*) #̸ *X*, if *π* ≠ id
(*NF*2) *a* #̸ [*a*]*t* =⇒ ⊥
(*NF*3) *a* #̸ [*b*]*t* =⇒ *a* #̸ *t*
(*NF*4) *a* #̸ *f*(*t̃*) =⇒ ⋁*i* *a* #̸ *ti*

R6: Simplification of Parameters

(*U*1) ∀*Y,Ȳ* : *P* ∧ *π* · *Y* ≈*α* *t* =⇒ ⊥, if *Y* ∈ vars(*t*)
(*U*2) ∀*Ȳ* : *P* ∧ (*π* · *Y* ≉*α* *t* ∨ *Q*) =⇒ ∀*Ȳ* : *P* ∧ *Q*[*Y/π*<sup>−1</sup> · *t*], if *Y* ∉ vars(*t*), *Y* ∈ *Ȳ*
(*U*3) ∀*Y,Ȳ* : *P* ∧ *π* · *Y* ≉*α* *t* =⇒ ⊥, if *π* · *Y* ≢ *t*
(*U*4) ∀*Ȳ* : *P* ∧ (*π*1 · *Z*1 ≈*α* *t*1 ∨ · · · ∨ *πn* · *Zn* ≈*α* *tn* ∨ *Q*) =⇒ ∀*Ȳ* : *P* ∧ *Q*
(*U*5) ∀*Y,Ȳ* : *P* ∧ *a*#*Y* =⇒ ⊥
(*U*6) ∀*Y,Ȳ* : *P* ∧ *a* #̸ *Y* =⇒ ⊥

**Conditions for** (*U*4)**:**

- Each equation in the disjunction contains at least one occurrence of a parameter and *πi* · *Zi* ≢ *ti* for each *i* = 1*,...,n*.
- *Q* does not contain any parameter.

R7: Instantiation Rules

(*I*1) *<sup>π</sup>* · *<sup>Z</sup>* <sup>≈</sup>*<sup>α</sup> <sup>t</sup>* <sup>∧</sup> *<sup>P</sup>* <sup>=</sup><sup>⇒</sup> *<sup>Z</sup>* <sup>≈</sup>*<sup>α</sup> <sup>π</sup>*−<sup>1</sup> · *<sup>t</sup>* <sup>∧</sup> *<sup>P</sup>*[*Z/π*−<sup>1</sup> · *<sup>t</sup>*]


$$(I_2)\ \pi \cdot Z \not\approx_{\alpha} t \lor P \implies Z \not\approx_{\alpha} \pi^{-1} \cdot t \lor P[Z/\pi^{-1} \cdot t]$$


R8: Explosion and Elimination of Disjunction

(*ED*1) ∀*Ȳ* : *P* ∧ (*P*1 ∨ *P*2) =⇒ ∀*Ȳ* : *P* ∧ *P*1, if vars(*P*1) ∩ *Ȳ* = ∅ or vars(*P*2) ∩ *Ȳ* = ∅.

(*ED*2) ∀*Ȳ*1*, Ȳ*2 : *P* ∧ (*P*1 ∨ *P*2) =⇒ ∀*Ȳ*1*, Ȳ*2 : *P* ∧ *P*1, if vars(*P*1) ∩ *Ȳ*2 = ∅ and vars(*P*2) ∩ *Ȳ*1 = ∅.

(*Exp*) ∃*W̄*∀*Ȳ* : *P* =⇒ ∃*W̄*′∃*W̄*∀*Ȳ* : *P* ∧ *X* ≈*α* *t*, for *t* = *f*(*W̄*′) or *t* = [*a*]*W*′ or *t* = *a*

#### **Conditions for** (*Exp*)**:**


#### **Fig. 3.** Globally Preserving Rules

*Example 6.* Both P = *a*#*Y*1 ∨ *Y* ≈*α* *f*(*Y*1) and P′ = *a*#*Y*1 ∨ *a* #̸ *Y* ∨ *Y*1 ≈*α* *f*(*Y*) are irreducible: neither (*U*4) nor (*ED*1) applies, since all the disjuncts contain parameters; (*ED*2) does not apply, since each constraint has a parameter that occurs in another constraint; (*Exp*) does not apply because there is no equation or disequation with a free or existentially quantified variable on one side.

The following lemma characterises the irreducible disjunctions with respect to rules R1-R<sup>8</sup> where parameters may remain.

**Lemma 4.** *Let P be a disjunction of constraints irreducible w.r.t.* R1*-*R8*. For each parameter Y such that P* = *a*#*Y* ∨ *Q (resp. P* = *a* #̸ *Y* ∨ *Q), for some atom a, the following holds:*


*Proof.* In an irreducible disjunction of constraints at least one of the sides of equations (or disequations) is a variable, otherwise we could simplify the equation/disequation.

**Condition 1.** It holds, otherwise we could apply (*T*9). **Condition 2.** It holds, otherwise we could apply (*ED*2).

**Condition 3.** If *<sup>Q</sup>* had an equation of the form *<sup>X</sup>* <sup>≈</sup>*<sup>α</sup> <sup>t</sup>*, for some free or existentially quantified variable, then *t* could not contain a parameter, otherwise we could apply rule (Exp). Therefore, *t* = *t*[*Z*1*,...,Zn*], for *n* ≥ 0 where each *Z<sup>i</sup>*


(*U*7) ∀*Y,Ȳ* : *P* ∧ (*a*#*Y* ∨ *Q*) =⇒ ⊥, if R1-R8 do not apply (so *Q* does not contain *a* #̸ *Y*) and *Y* ∈ vars(*Q*).
(*U*8) ∀*Y,Ȳ* : *P* ∧ (*a* #̸ *Y* ∨ *Q*) =⇒ ⊥, if R1-R8 do not apply (so *Q* does not contain *a*#*Y*) and *Y* ∈ vars(*Q*).

**Fig. 4.** Preserving Rules for (non)freshness constraints with parameters.

is either a free or existentially quantified variable, and one could apply rule *ED*1. Thus, if an equation exists, one of the sides has to be a parameter, say *Y* ≈*<sup>α</sup> t*, and *Y* cannot occur in *t* otherwise rule *O*<sup>2</sup> applies.

**Condition 4.** If *Q* were to contain a disequation, say *X* ≉*α* *t*, then *t* could not contain a parameter (otherwise we could apply (Exp) as above), but then we could apply rule (*ED*1). Therefore, if *Q* were to contain a disequation, it would be of the form *Y* ≉*α* *t*, and it would then reduce either with (*O*2) or with (*U*2). Thus, *Q* does not contain disequations. Similarly, if *Q* contained a primitive freshness constraint for a free or existentially quantified variable then (*ED*1) would apply.

The remaining disjunctions with parameters can be simplified using the rules in R9, since they will not produce solutions (as shown in Theorem 1).

We end this section with an example of application of the simplification rules.

*Example 7.* Let P be a NEP, using the signature from Example 1, as follows:

$$\mathcal{P} = \forall Y : \lambda[a]X \not\approx_{\alpha} \lambda[a]\lambda[a]Y \stackrel{DC}{\Longrightarrow} \forall Y : [a]X \not\approx_{\alpha} [a]\lambda[a]Y \stackrel{D_2}{\Longrightarrow} \forall Y : X \not\approx_{\alpha} \lambda[a]Y$$

Rules in R1-R<sup>7</sup> cannot be applied and the explosion rule produces six problems:

P1 = ∃*W*1∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* *λW*1
P2 = ∃*W*1*, W*2∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* *W*1 *W*2
P3 = ∃*W*∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* [*a*]*W*
P4 = ∃*W*∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* [*b*]*W*
P5 = ∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* *a*
P6 = ∀*Y* : *X* ≉*α* *λ*[*a*]*Y* ∧ *X* ≈*α* *b*

Reducing the first problem we get:

$$\begin{aligned}
\mathcal{P}_1 & \stackrel{I_1}{\Longrightarrow} \exists W_1 \forall Y : \lambda W_1 \not\approx_{\alpha} \lambda[a]Y \wedge X \approx_{\alpha} \lambda W_1 \\
& \stackrel{DC}{\Longrightarrow} \exists W_1 \forall Y : W_1 \not\approx_{\alpha} [a]Y \wedge X \approx_{\alpha} \lambda W_1 \\
& \stackrel{Exp}{\Longrightarrow} \exists W_1 W_2 \forall Y : W_1 \not\approx_{\alpha} [a]Y \wedge X \approx_{\alpha} \lambda W_1 \wedge W_1 \approx_{\alpha} \lambda W_2 \\
& \stackrel{I_1}{\Longrightarrow} \exists W_1 W_2 \forall Y : \lambda W_2 \not\approx_{\alpha} [a]Y \wedge X \approx_{\alpha} \lambda W_1 \wedge W_1 \approx_{\alpha} \lambda W_2 \\
& \stackrel{CL_1}{\Longrightarrow} \exists W_1 W_2 \forall Y : X \approx_{\alpha} \lambda W_1 \wedge W_1 \approx_{\alpha} \lambda W_2 \\
& \stackrel{I_1}{\Longrightarrow} \exists W_1 W_2 : X \approx_{\alpha} \lambda\lambda W_2 \wedge W_1 \approx_{\alpha} \lambda W_2.
\end{aligned}$$

At this point P<sup>1</sup> has reached a normal form without any parameter. Solutions of P<sup>1</sup> can be easily obtained by taking any instance of *X* of the form *λλt*. It is easy to check that this choice indeed generates solutions of P. Similar reductions apply to P*i*, 2 ≤ *i* ≤ 6.

As we will see in the next section, the application of these simplification rules is *well-behaved* in the sense that we do not lose any solution along the way.

#### **4.2 Soundness and Preservation of Solutions**

The next step is to ensure that the application of rules does not change the set of solutions of an equational problem.

**Definition 11 (Soundness and preservation of solution).** *Let* <sup>A</sup> *be any infinite subalgebra of* CORE*.*


$$\mathcal{S}(\mathcal{P}) \subseteq \bigcup_{\substack{\mathcal{P} \to_{\mathcal{R}} \pi \cdot \mathcal{P}_i \\ \mathsf{supp}(\pi) \cap \mathsf{Atoms}(\mathcal{P}) = \emptyset}} \mathcal{S}(\mathcal{P}_i).$$

All our rules, except those in R8, are sound and preserving (Theorem 1). The rules in R<sup>8</sup> create branches in the derivation tree; they are sound and only globally preserving (Theorem 2).

**Theorem 1.** *The rules in* <sup>R</sup><sup>1</sup> *to* <sup>R</sup><sup>7</sup> *and the rules in* <sup>R</sup><sup>9</sup> *are* <sup>A</sup>*-sound and* A*-preserving for any infinite subalgebra* A *of* CORE*.*

*Proof.* **Rules in <sup>R</sup>**1, **<sup>R</sup>**2, and **<sup>R</sup>**<sup>3</sup> : soundness and preservation of solutions are easy to deduce. For instance, for clash rules, (*CL*1) and (*CL*2), it follows by inspection of deduction rules that the judgement *sγ* ≈*<sup>α</sup> tγ* is not derivable for any valuation *ς* and corresponding grounding substitution *γ* = *σ<sup>ς</sup>* vars(*s,t*) (see Definition 7) if the root constructors of *s* and *t* are different (hence every *γ* is a solution for the disequation). For (*C*3) observe that we can take [*W/t*] as a witness for *W* on a validation for ∃*W* : *P*, if *W /*∈ vars(*P, t*).

**Rules in <sup>R</sup>**4 and <sup>R</sup>5. This follows from soundness and preservation of the simplification rules in [14]. We use the fact that the nominal equality and freshness rules from Fig. 1 are reversible; for instance, for a grounding substitution *γ*, the judgement ⊢ *f*(*s̃*)*γ* ≈*α* *f*(*ũ*)*γ* fails, which makes *f*(*s̃*)*γ* ≉*α* *f*(*ũ*)*γ* valid, iff one of the premises *siγ* ≈*α* *uiγ* does not hold.

**Rules in <sup>R</sup>**6: The result is straightforward for rules *<sup>U</sup>*<sup>1</sup> and *<sup>U</sup>*3.

*U*2. To prove soundness for *U*<sup>2</sup> notice that the solution set of a conjunction is the intersection of the solution set of each of its members. We have to show that every solution of *<sup>Q</sup>*[*Y /π*−<sup>1</sup> · *<sup>t</sup>*] is a solution of (*<sup>π</sup>* · *<sup>Y</sup>* ≈*<sup>α</sup> <sup>t</sup>* <sup>∨</sup> *<sup>Q</sup>*). Let *<sup>γ</sup>* be a solution of *<sup>Q</sup>*[*Y /π*−<sup>1</sup> · *<sup>t</sup>*] and take any substitution *<sup>λ</sup>* satisfying the conditions of Definition 10. So (*Q*[*Y /π*−<sup>1</sup> · *<sup>t</sup>*])*γλ* is valid and we need to show the validity of

$$(\pi \cdot Y \not\approx\_{\alpha} t) \gamma \lambda \vee Q \gamma \lambda. \tag{1}$$

For each such *λ* there are two possible cases. First, *π* · *Yλ* ≈<sub>α</sub> *tγλ* (note that *λ* is a ground substitution, so both sides of this equation are ground); then we have that *γλ* = *γλ′*[*Y*/*π*<sup>−1</sup> · *tγλ*]. By hypothesis, *γλ* validates *Q*[*Y*/*π*<sup>−1</sup> · *t*], so *γλ′*[*Y*/*π*<sup>−1</sup> · *tγλ*] validates *Q*. Second, *π* · *Yλ* ≉<sub>α</sub> *tγλ*; then *γλ* validates *π* · *Y* ≉<sub>α</sub> *t*. Hence *γ* is a solution of (1).

To prove preservation for *U*<sub>2</sub>, take *γ* a solution of ∀*Ȳ*,*Y* : *π* · *Y* ≉<sub>α</sub> *t* ∨ *Q*; we need to show that *γ* is also a solution of ∀*Ȳ*,*Y* : *Q*[*Y*/*π*<sup>−1</sup> · *t*]. Notice that *γ* is a solution of ∀*Ȳ*,*Y* : *π* · *Y* ≉<sub>α</sub> *t* or of ∀*Ȳ*,*Y* : *Q*, but it clearly cannot solve the first problem. Hence, *γ* solves ∀*Ȳ*,*Y* : *Q*. By Definition 10, for all substitutions *λ* with domain *Ȳ* ∪ {*Y*} we have that *λγ* validates *Q*. In particular, the substitution [*Y*/*π*<sup>−1</sup> · *tγ*]*λγ*, which is equivalent to [*Y*/*π*<sup>−1</sup> · *t*]*λγ* (since *γ* is away from *λ*), must also validate *Q*. Consequently, *λγ* validates *Q*[*Y*/*π*<sup>−1</sup> · *t*].

*U*<sub>4</sub>. Soundness for this rule follows trivially. For preservation of solutions, we show that any solution of ∀*Ȳ* : ⋁<sub>*i*</sub> *Z<sub>i</sub>* ≈<sub>α</sub> *t<sub>i</sub>* ∨ *Q* is a solution of ∀*Ȳ* : *Q*. The shape of the first problem induces the requirement that the disjunction ⋁<sub>*i*</sub> *Z<sub>i</sub>* ≈<sub>α</sub> *t<sub>i</sub>* does not have a solution. To show this we prove that the negated form ⋀<sub>*i*</sub> *Z<sub>i</sub>* ≉<sub>α</sub> *t<sub>i</sub>* has at least one solution. Notice that such a solution is a witness for the failure of ⋁<sub>*i*</sub> *Z<sub>i</sub>* ≈<sub>α</sub> *t<sub>i</sub>*, since all of those equations have at least one parameter. Lemma 5 shows that this is true.

*U*<sub>5</sub> and *U*<sub>6</sub>. We need to show that every solution of ∀*Ȳ*,*Y* : *P* ∧ *a*#*Y* is also a solution of ⊥, i.e., no such solution exists for the lhs of the rule. In fact, the existence of such a *γ* would imply (taking *λ* = [*Y/a*]) that *a*#*a* holds, which is impossible. For *U*<sub>6</sub> we use the same reasoning with *λ* = [*Y*/[*a*]*a*].

**Rules in R<sub>7</sub>**. Soundness and preservation of (*I*<sub>1</sub>) have been proved in previous works, since rule (*I*<sub>1</sub>) is used in standard nominal unification algorithms [23]. Rule (*I*<sub>2</sub>) is a direct adaptation of the rule used in the standard (syntactic) case, proved sound and preserving in [10]. Indeed, *γ* ∈ S(*π* · *Z* ≈<sub>α</sub> *t* ∨ *P*) if, and only if, for any grounding instance *γ′* of *γ*, *γ′* ∈ S(*Z* ≈<sub>α</sub> *π*<sup>−1</sup> · *t*) or *γ′* ∈ S(*P*) (by Lemma 3). Finally, notice that *γ* ∈ S(*P*) \ S(*Z* ≈<sub>α</sub> *π*<sup>−1</sup> · *t*) if and only if *γ* ∈ S(*P*[*Z*/*π*<sup>−1</sup> · *t*]).

**Rules in R<sub>9</sub>**. Soundness follows trivially, since ⊥ has no solution. We show below that *U*<sub>7</sub> is A-preserving; the proof is analogous for rule (*U*<sub>8</sub>).

Let P = ∃*W̄*∀*Ȳ*,*Y* : *P* ∧ (*a*#*Y* ∨ *Q*) where *Q* is fully reduced by R<sub>1</sub>–R<sub>8</sub>, *Y* ∈ vars(*Q*) and *Q* does not contain *a* #̸ *Y*. We prove that P does not have solutions by induction on the number of freshness constraints in *a*#*Y* ∨ *Q*.

**Base case:** *Q* contains just equational constraints, each containing at least one occurrence of the parameter *Y*, as specified in Lemma 4. Suppose by contradiction that there exists an A-solution *γ*. Thus, *γ* is away from *Ȳ* ∪ {*Y*}, dom(*γ*) = *X̄* = Fv(P), there is a ground substitution *δ* with dom(*δ*) = *W̄*, and for all *λ* away from *X̄*, *W̄*, with dom(*λ*) = *Ȳ* ∪ {*Y*}, *γδλ* A-validates *P* ∧ (*a*#*Y* ∨ *Q*). Then, it A-validates both *P* and (*a*#*Y* ∨ *Q*). The latter implies that *γδλ* A-validates *Q* for every *λ* (but then *Q* has a solution, which is impossible due to the form of the equational constraints) or *Q* implies *a* #̸ *Y* (since there is at least one *f* ∈ *Σ* such that *f* : *n* and *n* > 0, and therefore *a*#*Y* is false for an infinite number of ground terms *Yλ*). The latter is impossible since *a* #̸ *Y* means *a* ∈ supp(*Y*), which is defined as (*a a′*) · *Y* ≉<sub>α</sub> *Y* for a new *a′*, and reduced problems cannot contain fixed-point equations or their negations (these are simplified using rules (*E*<sub>1</sub>) and (*D*<sub>1</sub>), respectively).

The inductive step is proved similarly, using Lemma 4 as in the base case to deduce that the constraints in *Q* cannot entail *a* #̸ *Y*.

**Theorem 2.** *Let* A *be any infinite subalgebra of* CORE*. The rules in* R<sub>8</sub> *are* A*-sound and globally* A*-preserving.*


Lemma 5 guarantees the existence of a solution for a conjunction of non-trivial disequations as long as the algebra considered has sufficient ground terms.

**Lemma 5.** *Let* <sup>P</sup> *be a conjunction of non-trivial disequations. Let* <sup>A</sup> *be any infinite subalgebra of* CORE*. Then* P *has at least one solution in* A*.*

*Proof.* The proof proceeds by induction on the number of distinct variables occurring in P. For the base case, P has no variables. Then every substitution solves P, since by hypothesis P does not contain any trivial disequation *t* ≉<sub>α</sub> *t*.

Assume the result holds for problems with *m* − 1 variables. Let P be a conjunction of non-trivial disequations such that |vars(P)| = *m* and *X* ∈ vars(P). For each disequation *s* ≉<sub>α</sub> *t* ∈ P, the equation *s* ≈<sub>α</sub> *t* has at most one solution (modulo *α*-renaming) when the variables distinct from *X* are considered as constants. Let S be the set of such solutions for all these equations. Since *A* (the domain of A) is infinite, there exists *a* ∈ *A* such that [*X/a*] ∉ S. Therefore, [*X/a*] is a solution for P. Now, consider the problem P′ = P[*X/a*], which has *m* − 1 variables. The result follows by the induction hypothesis.

#### **4.3 Termination**

To prove termination we define a measure function for NEPs that strictly decreases with each application of a rule. The measure uses the following auxiliary functions:

**Definition 12 (Auxiliary Functions).** *The function* sizePar(*t*) *denotes the sum of the sizes of the parameter positions in t:*

$$\mathtt{sizePar}(t) := \sum\_{p\_j \in \mathtt{PosPar}(t)} |p\_j|$$

*where* PosPar(*t*) = {*p<sub>j</sub>* | *t*|<sub>*p<sub>j</sub>*</sub> = *Y<sub>i</sub> for some parameter Y<sub>i</sub>*}*.*

*Given a disjunction of equations, disequations, freshness, and negated freshness constraints d* = *C*<sup>1</sup> ∨ *...* ∨ *C<sup>n</sup> we define auxiliary functions φ*<sup>1</sup> *and φ*<sup>2</sup> *over d.*

- *(a)* MSP(*C*) = 0 *if C is an equation or disequation and a member of C is a solved parameter (a parameter Y is* solved *in d if there exists a disequation Y* ≉<sub>α</sub> *u in d and Y occurs only once in d); or if C is a primitive freshness or a primitive negated freshness constraint;*

- *(b) otherwise,* MSP(*s* ≈<sub>α</sub> *t*) = MSP(*s* ≉<sub>α</sub> *t*) = *max*(sizePar(*s*)*,* sizePar(*t*)) *and* MSP(*a*#*t*) = MSP(*a* #̸ *t*) = sizePar(*t*)*.*

**Definition 13 (Measure).** *Let* P = ∃*W̄*∀*Ȳ* : *d*<sub>1</sub> ∧ *...* ∧ *d<sub>n</sub>* *be a nominal equational problem in conjunctive normal form.* P *is measured using the tuple:*

$$\Phi(\mathcal{P}) = (N\_u, N\_d, \psi\_1(\mathcal{P}), M, \psi\_2(\mathcal{P})).$$


Using this measure we can prove the termination of the simplification process.

**Theorem 3.** *The procedure defined in Section 4 for application of rules, expressed as* R := R1R<sup>2</sup> *...* R9*, terminates.*

#### **5 Nominal Equational Solved Forms**

We have shown that the simplification process terminates and each application of the transformation rules preserves solutions. We now characterise the normal forms, called *solved forms*. Intuitively, solved forms are simple enough that one can easily extract solutions from them. A first example of a well-known solved form is that of *unification solved form*: a conjunction of equations *X<sub>i</sub>* = *t<sub>i</sub>* such that each *X<sub>i</sub>* occurs only once. It directly represents a solution mapping *X<sub>i</sub>* → *t<sub>i</sub>*.
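To illustrate, extracting the solution represented by a unification solved form is immediate. The following Python sketch is only an illustration of this classical case; the tuple-based term representation is an assumption, not notation from the paper.

```python
# Illustrative sketch: a solved form is a list of (variable, term) pairs
# in which every variable occurs exactly once on the left-hand side.
def solution_of_solved_form(equations):
    """Turn X1 = t1, ..., Xn = tn into the mapping Xi -> ti."""
    subst = {}
    for var, term in equations:
        assert var not in subst, "each variable occurs only once in a solved form"
        subst[var] = term
    return subst

# Example: X = f(a), Y = g(b), with terms written as nested tuples.
print(solution_of_solved_form([("X", ("f", "a")), ("Y", ("g", "b"))]))
```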

We show in Theorem 4 existence of solutions for certain solved forms, and in Theorem 5 we prove that our procedure is complete with respect to solved forms.

#### **Definition 14 (Solved Forms).**


$$\mathcal{P} = \exists \overline{W} : \left(\bigwedge\_{i=1}^n Z\_i \approx\_\alpha t\_i\right) \land \left(\bigwedge\_{j=1}^m Z'\_j \not\approx\_\alpha v\_j\right) \land \left(\bigwedge\_{l=1}^p C\_l\right),$$

*such that:*


Theorem 4 below shows that a problem reduced to definition with constraints solved form has at least one solution.

**Theorem 4.** *Let* A *be any infinite subalgebra of* CORE*. If* P ≢ ⊥ *is in definition with constraints solved form, then it has at least one solution.*

*Proof.* First assume P is in unification solved form (see Definition 14). Let ∇ be the context containing all constraints *C<sub>l</sub>* occurring in P. Furthermore, define the substitution *σ* that assigns to each free variable *X<sub>i</sub>* the term *t<sub>i</sub>*, and the substitution *δ* mapping each existential variable *W<sub>k</sub>* to *t<sub>k</sub>*. Then ‖∇*σδ*‖<sub>*ς*</sub>, which is equivalent to ‖∇‖<sub>*ς*</sub>*σδ* by Lemma 1, is valid in A. Consequently,

$$\left\|\nabla \vdash X\_i \sigma \approx\_\alpha t\_i \sigma \delta\right\|\_\varsigma \text{ and } \left\|\nabla \vdash W\_k \delta \approx\_\alpha t\_k \delta\right\|\_\varsigma.$$

are valid judgements. So, *σ* is an A-solution of P with existential witnesses given by *δ*. In the general case, when P is in *definition with constraints* solved form containing also negative constraints, the construction is similar: we can guarantee a solution for the disunification part of the problem, ⋀<sub>*j*=1</sub><sup>*m*</sup> *Z′<sub>j</sub>* ≉<sub>α</sub> *v<sub>j</sub>*, by Lemma 5.

**Definition 15.** *A set* R *of rules for solving nominal equational problems is* complete *w.r.t. a kind of solved forms S if for each* P *there exists a family of* NEP*s* Q<sub>*i*</sub> *in S-solved form such that* P ⟹<sub>R</sub><sup>∗</sup> Q<sub>*i*</sub> *and* S(P) = ⋃<sub>*i*</sub> S(Q<sub>*i*</sub>)*.*

The next result states that a NEP's normal form with respect to the simplification rules given in the previous section is a definition with constraints. In particular, all parameters are removed from the problem. The proof is by case analysis, considering all possible occurrences of parameters in a problem.

**Theorem 5 (Completeness).** *Let* <sup>A</sup> *be any infinite subalgebra of* CORE*. Then the rules in Figures 2, 3, and 4 are complete for parameterless solved forms and definition with constraints solved forms.*

#### **6 Conclusion**

In this paper, we introduced *nominal equational problems* (NEPs) as an extension of standard first-order equational problems to nominal terms which, besides equations and disequations, includes freshness and non-freshness constraints. We proposed a sound and preserving rule-based algorithm to solve NEPs in the nominal ground algebra CORE, and showed that this algorithm is complete for two main types of solved forms: parameterless and definition with constraints. As future work, we aim to investigate the purely equational approach to nominal syntax via the formulation of freshness constraints using fixed-point equations with the N-quantifier [21], as well as the solvability of nominal equational problems in more complex algebras.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **Finding Cut-Offs in Leaderless Rendez-Vous Protocols is Easy**<sup>⋆</sup>

A. R. Balasubramanian<sup>1</sup>, Javier Esparza<sup>1</sup>, Mikhail Raskin<sup>1</sup>

Technische Universität München, Munich, Germany bala.ayikudi@tum.de, esparza@in.tum.de, raskin@in.tum.de

**Abstract.** In rendez-vous protocols an arbitrarily large number of indistinguishable finite-state agents interact in pairs. The cut-off problem asks if there exists a number B such that all initial configurations of the protocol with at least B agents in a given initial state can reach a final configuration with all agents in a given final state. In a recent paper [17], Horn and Sangnier prove that the cut-off problem is equivalent to the Petri net reachability problem for protocols with a leader, and in EXPSPACE for leaderless protocols. Further, for the special class of symmetric protocols they reduce these bounds to PSPACE and NP, respectively. The problem of lowering these upper bounds or finding matching lower bounds is left open. We show that the cut-off problem is P-complete for leaderless protocols, NP-complete for symmetric protocols with a leader, and in NC for leaderless symmetric protocols, thereby solving all the problems left open in [17].

**Keywords:** rendez-vous protocols · cut-off problem · Petri nets

#### **1 Introduction**

Distributed systems are often designed for an unbounded number of participant agents. Therefore, they are not just one system, but an infinite family of systems, one for each number of agents. Parameterized verification addresses the problem of checking that all systems in the family satisfy a given specification.

In many application areas, agents are indistinguishable. This is the case in computational biology, where cells or molecules have no identities; in some security applications, where the agents' identities should stay private; or in applications where the identities can be abstracted away, like certain classes of multithreaded programs [15,2,31,3,18,25]. Following [3,18], we use the term *replicated systems* for distributed systems with indistinguishable agents. Replicated systems include population protocols, broadcast protocols, threshold automata, and many other models [15,2,11,7,16]. They also arise after applying a *counter abstraction* [28,3]. In finite-state replicated systems the global state of the system is determined by the function (usually called a *configuration*) that assigns

<sup>⋆</sup> This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 787367 (PaVeS).


to each state the number of agents that currently occupy it. This feature makes many verification problems decidable [4,10].

Surprisingly, there is no a priori relation between the complexity of a parameterized verification question (i.e., whether a given property holds for all initial configurations, or, equivalently, whether its negation holds for some configuration), and the complexity of its corresponding single-instance question (whether the property holds for a fixed initial configuration). Consider replicated systems where agents interact in pairs [15,17,2]. The complexity of single-instance questions is very robust. Indeed, checking most properties, including all properties expressible in LTL and CTL, is PSPACE-complete [9]. On the contrary, the complexity of parameterized questions is very fragile, as exemplified by the following example. While the existence of a reachable configuration that populates a given state with *at least* one agent is in P, and so well below PSPACE, the existence of a reachable configuration that populates a given state with *exactly* one agent is as hard as the reachability problem for Petri nets, and so non-elementary [6]. This fragility makes the analysis of parameterized questions very interesting, but also much harder.

Work on parameterized verification has concentrated on whether every initial configuration satisfies a given property (see e.g. [15,11,3,18,7]). However, applications often lead to questions of the form "do all initial configurations *in a given set* satisfy the property?", "do infinitely many initial configurations satisfy the property?", or "do all but finitely many initial configurations satisfy the property?". An example of the first kind is proving correctness of population protocols, where the specification requires that for a given partition I0, I<sup>1</sup> of the set of initial configurations, and a partition Q0, Q<sup>1</sup> of the set of states, runs starting from I<sup>0</sup> eventually trap all agents within Q0, and similarly for I<sup>1</sup> and Q<sup>1</sup> [12]. An example of the third kind is the existence of *cut-offs*; cut-off properties state the existence of an initial configuration such that for all larger initial configurations some given property holds [8,4]. A systematic study of the complexity of these questions is still out of reach, but first results are appearing. In particular, Horn and Sangnier have recently studied the complexity of the *cut-off problem* for parameterized rendez-vous networks [17]. The problem takes as input a network with one single initial state *init* and one single final state *fin*, and asks whether there exists a cut-off B such that for every number of agents n ≥ B, the final configuration in which all agents are in state *fin* is reachable from the initial configuration in which all agents are in state *init*.

Horn and Sangnier study two versions of the cut-off problem, for leaderless networks and networks with a leader. Intuitively, a leader is a distinguished agent with its own set of states. They show that in the presence of a leader the cut-off problem and the reachability problem for Petri nets are inter-reducible, which shows that the cut-off problem is in the Ackermannian complexity class F<sup>ω</sup> [22], and non-elementary [6]. For the leaderless case, they show that the problem is in EXPSPACE. Further, they also consider the special case of symmetric networks, for which they obtain better upper bounds: PSPACE for the case of a leader, and NP in the leaderless case. These results are summarized at the top of Table 1.

**Table 1.** Summary of the results by Horn and Sangnier and the results of this paper.

In [17] the question of improving the upper bounds or finding matching lower bounds is left open. In this paper we close it with a surprising answer: all elementary upper bounds of [17] can be dramatically improved. In particular, our main result shows that the EXPSPACE bound for the leaderless case can be brought down to P. Further, the PSPACE and NP bounds of the symmetric case can be lowered to NP and NC, respectively, as shown at the bottom of Table 1. We also obtain matching lower bounds. Finally, we provide almost tight upper bounds for the size of the cut-off B; more precisely, we show that if B exists, then B ∈ 2<sup>n<sup>O(1)</sup></sup> for a protocol of size n.

Our results follow from two lemmas, called the Scaling and Insertion Lemmas, that connect the *continuous semantics* for Petri nets to their standard semantics. In the continuous semantics of Petri nets transition firings can be scaled by a positive rational factor; for example, a transition can fire with factor 1/3, taking "1/3 of a token" from its input places. The continuous semantics is a relaxation of the standard one, and its associated reachability problem is much simpler (polynomial instead of non-elementary [14,6,5]). The Scaling Lemma<sup>1</sup> states that given two markings M, M′ of a Petri net, if M′ is reachable from M in the continuous semantics, then nM′ is reachable from nM in the standard semantics for some n ∈ 2<sup>m<sup>O(1)</sup></sup>, where m is the total size of the net and the markings. The Insertion Lemma states that, given four markings M, M′, L, L′, if M′ is reachable from M in the continuous semantics and the *marking equation* L′ = L + A**x** has a solution **x** ∈ Z<sup>T</sup> (observe that **x** can have negative components), then nM′ + L′ is reachable from nM + L in the standard semantics for some n ∈ 2<sup>m<sup>O(1)</sup></sup>. We think that these lemmas can be of independent interest.

The paper is organized as follows. Section 2 contains preliminaries; in particular, it defines the cut-off problem for rendez-vous networks and reduces it to the cut-off problem for Petri nets. Section 3 gives a polynomial time algorithm for the leaderless cut-off problem for acyclic Petri nets. Section 4 introduces the Scaling and Insertion Lemmas, and Section 5 presents the novel polynomial

<sup>1</sup> Heavily based on previous results by Fraca and Haddad [14].

time algorithm for the cut-off problem. Sections 6 and 7 present the results for symmetric networks, for the cases with and without leaders, respectively.

Due to lack of space, full proofs of some of the lemmas can be found in the appendix.

### **2 Preliminaries**

**Multisets** Let E be a finite set. For a semi-ring S, a vector from E to S is a function v : E → S. The set of all vectors from E to S will be denoted by S<sup>E</sup>. In this paper, the semi-rings we will be concerned with are the natural numbers N, the integers Z and the non-negative rationals Q<sub>≥0</sub> (under the usual addition and multiplication operators). The *support* of a vector v is the set ⟦v⟧ := {e : v(e) ≠ 0} and its *size* is the number ‖v‖ = Σ<sub>e∈⟦v⟧</sub> abs(v(e)), where abs(x) denotes the absolute value of x. Vectors from E to N are also called discrete multisets (or just multisets) and vectors from E to Q<sub>≥0</sub> are called continuous multisets.

Given a multiset M and a number α we let α · M be the multiset given by (α · M)(e) = M(e) · α for all e ∈ E. Given two multisets M and M′ we say that M ≤ M′ if M(e) ≤ M′(e) for all e ∈ E; we let M + M′ be the multiset given by (M + M′)(e) = M(e) + M′(e), and if M′ ≤ M, we let M − M′ be the multiset given by (M − M′)(e) = M(e) − M′(e). The empty multiset is denoted by **0**. We sometimes denote multisets using a set-like notation, e.g. {a, 2 · b, c} denotes the multiset M given by M(a) = 1, M(b) = 2, M(c) = 1 and M(e) = 0 for all e ∉ {a, b, c}.
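These multiset operations are straightforward to implement; the following Python sketch (multisets as plain dictionaries, an encoding chosen only for illustration) mirrors the definitions above.

```python
# Multisets over a finite set E, represented as dicts mapping elements to counts.
def support(v):            # {e : v(e) != 0}
    return {e for e, c in v.items() if c != 0}

def size(v):               # sum of the absolute values of the entries
    return sum(abs(c) for c in v.values())

def scale(alpha, m):       # (alpha * m)(e) = alpha * m(e)
    return {e: alpha * c for e, c in m.items()}

def add(m1, m2):           # (m1 + m2)(e) = m1(e) + m2(e)
    return {e: m1.get(e, 0) + m2.get(e, 0) for e in set(m1) | set(m2)}

def leq(m1, m2):           # m1 <= m2 pointwise
    return all(c <= m2.get(e, 0) for e, c in m1.items())

def sub(m1, m2):           # defined only when m2 <= m1
    assert leq(m2, m1)
    return {e: m1.get(e, 0) - m2.get(e, 0) for e in set(m1) | set(m2)}

# The multiset written {a, 2*b, c} in the text:
M = {"a": 1, "b": 2, "c": 1}
print(support(M), size(M), leq({"b": 1}, M))
```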

Given an I × J matrix A with I and J sets of indices, I′ ⊆ I and J′ ⊆ J, we let A<sub>I′×J′</sub> denote the restriction of A to rows indexed by I′ and columns indexed by J′.

**Rendez-vous protocols and the cut-off problem.** Let Σ be a fixed finite set which we will call the communication alphabet and we let RV (Σ) = {!a, ?a : a ∈ Σ}. The symbol !a denotes that the message a is sent and ?a denotes that the message a is received.

**Definition 1.** *A rendez-vous protocol* P *is a tuple* (Q, Σ, *init*, *fin*, R) *where* Q *is a finite set of states,* Σ *is the communication alphabet, init*, *fin* ∈ Q *are the initial and final states respectively and* R ⊆ Q × RV (Σ) × Q *is the set of rules.*

The size |P| of a protocol is defined as the number of bits needed to encode P in {0, 1}<sup>∗</sup> using some standard encoding. A configuration C of P is a multiset of states, where C(q) should be interpreted as the number of agents in state q. We use C(P) to denote the set of all configurations of P. An initial (final) configuration C is a configuration such that C(q) = 0 if q ≠ *init* (resp. C(q) = 0 if q ≠ *fin*). We use C<sup>n</sup><sub>init</sub> (C<sup>n</sup><sub>fin</sub>) to denote the initial (resp. final) configuration such that C<sup>n</sup><sub>init</sub>(*init*) = n (resp. C<sup>n</sup><sub>fin</sub>(*fin*) = n).

The operational semantics of a rendez-vous protocol P is given by means of a transition system between the configurations of P. We say that there is a transition between C and C′, denoted by C ⇒ C′, iff there exist a ∈ Σ and p, q, p′, q′ ∈ Q such that (p, !a, p′), (q, ?a, q′) ∈ R, C ≥ {p, q} and C′ = C − {p, q} + {p′, q′}. As usual, <sup>∗</sup>=⇒ denotes the reflexive and transitive closure of ⇒. The cut-off problem for rendez-vous protocols, as defined in [17], is:

*Given:* A rendez-vous protocol P

*Decide:* Is there B ∈ N such that C<sup>n</sup><sub>init</sub> <sup>∗</sup>=⇒ C<sup>n</sup><sub>fin</sub> for every n ≥ B?

If such a B exists then we say that P admits a cut-off and that B is a cut-off for P.
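To make the transition relation ⇒ concrete, here is a small Python sketch that enumerates the successors of a configuration; configurations are represented as Counters, and the toy protocol at the end is a made-up example, not one taken from the paper.

```python
from collections import Counter

def successors(C, rules):
    """All C' with C => C': pick a matching send/receive pair and move both agents."""
    succs = []
    for (p, send, p2) in rules:
        if not send.startswith("!"):
            continue
        a = send[1:]
        for (q, recv, q2) in rules:
            if recv != "?" + a:
                continue
            need = Counter([p, q])
            if all(C[s] >= need[s] for s in need):          # C >= {p, q}
                succs.append(C - need + Counter([p2, q2]))  # C - {p,q} + {p',q'}
    return succs

# Toy protocol: a sender rule and a receiver rule on the letter a.
rules = [("init", "!a", "fin"), ("init", "?a", "fin")]
print(successors(Counter({"init": 2}), rules))   # [Counter({'fin': 2})]
```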

**Petri nets.** Rendez-vous protocols can be seen as a special class of Petri nets.

**Definition 2.** *A Petri net is a tuple* N = (P, T,*Pre*,*Post*) *where* P *is a finite set of places,* T *is a finite set of transitions, Pre and Post are matrices whose rows and columns are indexed by* P *and* T *respectively and whose entries belong to* <sup>N</sup>*. The incidence matrix* <sup>A</sup> *of* <sup>N</sup> *is defined to be the* <sup>P</sup> <sup>×</sup> <sup>T</sup> *matrix given by* A = *Post* −*Pre. Further by the* weight *of* N *, we mean the largest absolute value appearing in the matrices Pre and Post .*

The size |N | of N is defined as the number of bits needed to encode N in {0, 1}<sup>∗</sup> using some suitable encoding. For a transition t ∈ T we let • t = {p : *Pre*[p, t] > 0} and t • = {p : *Post*[p, t] > 0}. We extend this notation to set of transitions in the obvious way. Given a Petri net N , we can associate with it a graph where the vertices are P ∪ T and the edges are {(p, t) : p ∈ • t}∪{(t, p) : p ∈ t • }. A Petri net N is called acyclic if its associated graph is acyclic.

A *marking* of a Petri net is a multiset M ∈ N<sup>P</sup>, which intuitively denotes the number of *tokens* that are present in every place of the net. For t ∈ T and markings M and M′, we say that M′ is reached from M by firing t, denoted M <sup>t</sup>−→ M′, if for every place p, M(p) ≥ *Pre*[p, t] and M′(p) = M(p) + A[p, t].

A *firing sequence* is any sequence of transitions σ = t<sub>1</sub>, t<sub>2</sub>,...,t<sub>k</sub> ∈ T<sup>∗</sup>. The support of σ, denoted by ⟦σ⟧, is the set of all transitions which appear in σ. We let σσ′ denote the concatenation of two sequences σ, σ′.

Given a firing sequence σ = t<sub>1</sub>, t<sub>2</sub>,...,t<sub>k</sub> ∈ T<sup>∗</sup>, we let M <sup>σ</sup>−→ M′ denote that there exist M<sub>1</sub>,...,M<sub>k−1</sub> such that M <sup>t<sub>1</sub></sup>−→ M<sub>1</sub> <sup>t<sub>2</sub></sup>−→ M<sub>2</sub> ... M<sub>k−1</sub> <sup>t<sub>k</sub></sup>−→ M′. Further, M → M′ denotes that there exists t ∈ T such that M <sup>t</sup>−→ M′, and M <sup>∗</sup>−→ M′ denotes that there exists σ ∈ T<sup>∗</sup> such that M <sup>σ</sup>−→ M′.
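The firing rule translates directly into code. In the sketch below markings are dictionaries over places and *Pre*/*Post* are indexed first by transition and then by place; this representation is an assumption made only for convenience.

```python
def fire(M, t, Pre, Post):
    """Return the marking reached from M by firing t, or None if t is not enabled."""
    if any(M.get(p, 0) < Pre[t].get(p, 0) for p in Pre[t]):
        return None                       # M(p) >= Pre[p, t] must hold for every place
    M2 = dict(M)
    for p, k in Pre[t].items():
        M2[p] = M2.get(p, 0) - k
    for p, k in Post[t].items():
        M2[p] = M2.get(p, 0) + k          # together: M'(p) = M(p) + A[p, t]
    return M2

def fire_sequence(M, sigma, Pre, Post):
    """Fire the transitions of sigma in order; None if some transition is disabled."""
    for t in sigma:
        M = fire(M, t, Pre, Post)
        if M is None:
            return None
    return M
```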

*Marking equation of a Petri net system.* In the following, a *Petri net system* is a triple (N, M, M′) where N is a Petri net and M ≠ M′ are markings. The *marking equation* for (N, M, M′) is the equation

$$M' = M + \mathcal{A}\mathbf{v}$$

over the variables **v**. It is well known that M <sup>σ</sup>−→ M′ implies M′ = M + A σ⃗, where σ⃗ ∈ N<sup>T</sup> is the *Parikh image* of σ, defined as the vector whose component σ⃗[t] for transition t is equal to the number of times t appears in σ. Therefore, if M <sup>σ</sup>−→ M′ then σ⃗ is a nonnegative integer solution of the marking equation. The converse does not hold.
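The Parikh image and the necessary condition given by the marking equation can be checked as in the following numpy sketch, where the incidence matrix is indexed places × transitions as in Definition 2 (the function names are illustrative).

```python
import numpy as np
from collections import Counter

def parikh(sigma, transitions):
    """Vector counting how often each transition occurs in the firing sequence sigma."""
    c = Counter(sigma)
    return np.array([c[t] for t in transitions])

def satisfies_marking_equation(M, M2, A, x):
    """Check M' = M + A x, a necessary condition for M --sigma--> M' with x = Parikh(sigma)."""
    return np.array_equal(np.asarray(M2), np.asarray(M) + np.asarray(A) @ x)
```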

**From rendez-vous protocols to Petri nets.** Let P = (Q, Σ, *init*, *fin*, R) be a rendez-vous protocol. Create a Petri net N<sub>P</sub> = (P, T, *Pre*, *Post*) as follows. The set of places is Q. For each letter a ∈ Σ and for each pair of rules r = (q, !a, s), r′ = (q′, ?a, s′) ∈ R, add a transition t<sub>r,r′</sub> to N<sub>P</sub> and set

**–** *Pre*[p, t] = 0 for every p ∉ {q, q′}, and *Post*[p, t] = 0 for every p ∉ {s, s′}

**–** If q = q′ then *Pre*[q, t] = 2, otherwise *Pre*[q, t] = *Pre*[q′, t] = 1

**–** If s = s′ then *Post*[s, t] = 2, otherwise *Post*[s, t] = *Post*[s′, t] = 1.

It is clear that any configuration of a protocol P is also a marking of N<sub>P</sub>, and vice versa. Further, the following proposition is obvious.
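A sketch of this construction in Python (states as place names, one transition per compatible send/receive pair; the dictionary-based encoding and rule format are assumptions made for illustration):

```python
def protocol_to_net(R):
    """Build Pre/Post (indexed transition -> place -> count) from the rules R."""
    Pre, Post = {}, {}
    for (q, l1, s) in R:
        for (q2, l2, s2) in R:
            if not (l1.startswith("!") and l2 == "?" + l1[1:]):
                continue                      # need a matching pair (q,!a,s), (q',?a,s')
            t = ((q, l1, s), (q2, l2, s2))    # the transition t_{r,r'}
            pre, post = {}, {}
            pre[q] = pre.get(q, 0) + 1        # yields 2 if q = q', else 1 on each of q, q'
            pre[q2] = pre.get(q2, 0) + 1
            post[s] = post.get(s, 0) + 1      # symmetrically for the target states
            post[s2] = post.get(s2, 0) + 1
            Pre[t], Post[t] = pre, post
    return Pre, Post
```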

**Proposition 1.** *For any two configurations* C *and* C′ *we have that* C <sup>∗</sup>=⇒ C′ *over the protocol* P *iff* C <sup>∗</sup>−→ C′ *over the Petri net* N<sub>P</sub>*.*

Consequently, the cut-off problem for Petri nets, defined by

*Given:* A Petri net system (N, M, M′)

*Decide:* Is there B ∈ N such that n · M <sup>∗</sup>−→ n · M′ for every n ≥ B?

generalizes the problem for rendez-vous protocols.

#### **3 The cut-off problem for acyclic Petri nets**

We show that the cut-off problem for acyclic Petri nets can be solved in polynomial time. The reason for considering this special case first is that it illustrates one of the main ideas of the general case in a very pure form.

Let us fix a Petri net system (N, M, M′) for the rest of this section, where N = (P, T, *Pre*, *Post*) is acyclic and A is its incidence matrix. It is well-known that in acyclic Petri nets the reachability relation is characterized by the marking equation (see e.g. [24]):

**Proposition 2 ([24]).** *Let* (N, M, M′) *be an acyclic Petri net system. For every sequence* σ ∈ T<sup>∗</sup>*, we have* M <sup>σ</sup>−→ M′ *iff* σ⃗ *is a solution of the marking equation. Consequently,* M <sup>∗</sup>−→ M′ *iff the marking equation has a nonnegative integer solution.*

This proposition shows that the reachability problem for acyclic Petri nets reduces to the feasibility problem (i.e., existence of solutions) of systems of linear diophantine equations over the nonnegative integers. So the reachability problem for acyclic Petri nets is in NP, and in fact both the reachability and the feasibility problems are NP-complete [13].

There are two ways to relax the conditions on the solution so as to make the feasibility problem polynomial. Feasibility over the nonnegative *rationals* and feasibility over all integers are both in P. The first is due to the polynomiality of linear programming. For the second, feasibility can be decided in polynomial time after computing the Smith or Hermite normal forms (see e.g. [29]), which can themselves be computed in polynomial time [19]. We show that the cut-off problem can be reduced to these two relaxed problems.
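For instance, feasibility of the marking equation over the non-negative rationals can be checked with an off-the-shelf LP solver. The sketch below uses scipy; the integer relaxation (via Smith or Hermite normal forms) is not shown here.

```python
import numpy as np
from scipy.optimize import linprog

def feasible_over_nonneg_rationals(A, M, M2):
    """Is there x >= 0 with A x = M' - M ?  (Trivial objective; we only need feasibility.)"""
    A = np.asarray(A, dtype=float)
    b = np.asarray(M2, dtype=float) - np.asarray(M, dtype=float)
    res = linprog(c=np.zeros(A.shape[1]), A_eq=A, b_eq=b,
                  bounds=[(0, None)] * A.shape[1], method="highs")
    return res.status == 0    # status 0 means a feasible (optimal) point was found
```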

#### **3.1 Characterizing acyclic systems with cut-offs**

Horn and Sangnier proved in [17] a very useful characterization of the rendez-vous protocols with a cut-off: A rendez-vous protocol P admits a cut-off iff there exists n ∈ N such that C<sup>n</sup><sub>init</sub> <sup>∗</sup>=⇒ C<sup>n</sup><sub>fin</sub> and C<sup>n+1</sup><sub>init</sub> <sup>∗</sup>=⇒ C<sup>n+1</sup><sub>fin</sub>. The proof immediately generalizes to the case of Petri nets:

**Lemma 1 ([17]).** *A Petri net system* (N, M, M′) *(acyclic or not) admits a cut-off iff there exists* n ∈ N *such that* n·M <sup>∗</sup>−→ n·M′ *and* (n+1)·M <sup>∗</sup>−→ (n+1)·M′*. Moreover if* n · M <sup>∗</sup>−→ n · M′ *and* (n+1) · M <sup>∗</sup>−→ (n+1) · M′*, then* n<sup>2</sup> *is a cut-off for the system.*

Using this lemma, we characterize those acyclic Petri net systems which admit a cut-off.

**Theorem 1.** *An acyclic Petri net system* (N, M, M′) *admits a cut-off iff the marking equation has solutions* **x** ∈ Q<sup>T</sup><sub>≥0</sub> *and* **y** ∈ Z<sup>T</sup> *such that* ⟦**y**⟧ ⊆ ⟦**x**⟧*.*

*Proof.* (⇒): Suppose (N, M, M′) admits a cut-off. Hence there exists b ∈ N such that for all n ≥ b we have nM <sup>∗</sup>−→ nM′. Let bM <sup>σ′</sup>−→ bM′ and (b+1)M <sup>τ′</sup>−→ (b+1)M′. Then, notice that (2b+1)M <sup>σ′τ′</sup>−−→ (2b+1)M′ and (2b+2)M <sup>τ′τ′</sup>−−→ (2b+2)M′. Hence, if we let n = 2b+1, σ = σ′τ′ and τ = τ′τ′, we have nM <sup>σ</sup>−→ nM′, (n+1)M <sup>τ</sup>−→ (n+1)M′ and ⟦τ⟧ ⊆ ⟦σ⟧. By Proposition 2, there exist **x**′, **y**′ ∈ N<sup>T</sup> such that ⟦**y**′⟧ ⊆ ⟦**x**′⟧, nM′ = nM + A**x**′ and (n+1)M′ = (n+1)M + A**y**′. Letting **x** = **x**′/n and **y** = **y**′ − **x**′, we get our required vectors.

(⇐): Suppose **<sup>x</sup>** <sup>∈</sup> <sup>Q</sup><sup>T</sup> <sup>≥</sup><sup>0</sup> and **<sup>y</sup>** <sup>∈</sup> <sup>Z</sup><sup>T</sup> are solutions of the marking equation such that **<sup>y</sup>** <sup>⊆</sup> **<sup>x</sup>**. Let <sup>μ</sup> be the least common multiple of the denominators of the components of **x**, and let α be the largest absolute value of the numbers in the vector **<sup>y</sup>**. By definition of <sup>μ</sup> we have <sup>α</sup>(μ**x**) <sup>∈</sup> <sup>N</sup><sup>T</sup> . Also, since **<sup>y</sup>** <sup>⊆</sup> **<sup>x</sup>** it follows by definition of <sup>α</sup> that <sup>α</sup>(μ**x**) + **<sup>y</sup>** <sup>≥</sup> **<sup>0</sup>** and hence <sup>α</sup>(μ**x**) + **<sup>y</sup>** <sup>∈</sup> <sup>N</sup><sup>T</sup> . Since M = M + A**x** and M = M + A**y** we get

αμM = αμM + A(αμ**x**) and (αμ + 1)M = (αμ + 1)M + A(αμ**x** + **y**)

Taking αμ = n, by Proposition 2 we get that nM <sup>∗</sup> −→ nM and (<sup>n</sup> + 1)<sup>M</sup> <sup>∗</sup> −→ (n + 1)M . By Lemma 1, (N ,M,M ) admits a cut-off.

Intuitively, the existence of the rational solution **x** ∈ Q<sup>T</sup><sub>≥0</sub> guarantees nM <sup>∗</sup>−→ nM′ for infinitely many n, and the existence of the integer solution **y** ∈ Z<sup>T</sup> guarantees that for one of those n we have (n+1)M <sup>∗</sup>−→ (n+1)M′ as well.

*Example 1.* The net system given by the net in Figure 1, along with the markings M = {i} and M′ = {f}, admits a cut-off. The conditions of the theorem are satisfied by **x** = (1/5, 1/5, 1/5, 1/5) and **y** = (−1, 1, 1, 1).

**Fig. 1.** A net with cut-off 2.

#### **3.2 Polynomial time algorithm**

We derive a polynomial time algorithm for the cut-off problem from the characterization of Theorem 1. The first step is the following lemma. A very similar lemma is proved in [14], but since the proof is short we give it for the sake of completeness:

**Lemma 2.** *If the marking equation is feasible over* <sup>Q</sup>≥<sup>0</sup>*, then it has a solution with maximum support. Moreover, such a solution can be found in polynomial time.*

*Proof.* If **<sup>y</sup>**, **<sup>z</sup>** <sup>∈</sup> <sup>Q</sup><sup>T</sup> <sup>≥</sup><sup>0</sup> are solutions of the marking equation, then we have <sup>M</sup> <sup>=</sup> <sup>M</sup> <sup>+</sup> <sup>A</sup>((**<sup>y</sup>** <sup>+</sup> **<sup>z</sup>**)/2) and **<sup>y</sup>** <sup>∪</sup> **<sup>z</sup>** <sup>⊆</sup> -(**<sup>y</sup>** <sup>+</sup> **<sup>z</sup>**)/2. Hence if the marking equation if feasible over <sup>Q</sup>≥<sup>0</sup>, then it has a solution with maximum support.

To find such a solution in polynomial time we proceed as follows. For every transition t we solve the linear program M′ = M + A**v**, **v** ≥ **0**, **v**(t) > 0. (Recall that solving linear programs over the rationals can be done in polynomial time.) Let {t<sub>1</sub>,...,t<sub>n</sub>} be the set of transitions whose associated linear programs are feasible over Q<sup>T</sup><sub>≥0</sub>, and let {**u**<sub>1</sub>,..., **u**<sub>n</sub>} be solutions to these programs. Then (1/n) · Σ<sup>n</sup><sub>i=1</sub> **u**<sub>i</sub> is a solution of the marking equation with maximum support.
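A sketch of this computation with scipy: it returns only the maximum support rather than the averaged solution, and instead of the strict inequality **v**(t) > 0 each per-transition program maximises **v**(t), treating an unbounded or positive optimum as a positive answer, which is equivalent here.

```python
import numpy as np
from scipy.optimize import linprog

def maximum_support(A, M, M2):
    """Transitions t for which some x >= 0 with A x = M' - M has x(t) > 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(M2, dtype=float) - np.asarray(M, dtype=float)
    support = set()
    for t in range(A.shape[1]):
        c = np.zeros(A.shape[1]); c[t] = -1.0          # maximise x(t)
        res = linprog(c, A_eq=A, b_eq=b,
                      bounds=[(0, None)] * A.shape[1], method="highs")
        if res.status == 3 or (res.status == 0 and res.x[t] > 1e-9):
            support.add(t)                              # unbounded or positive optimum
    return support
```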

We now have all the ingredients to give a polynomial time algorithm.

**Theorem 2.** *The cut-off problem for acyclic net systems can be solved in polynomial time.*

*Proof.* First, we check that the marking equation has a solution over the nonnegative rationals. If such a solution does not exist, by Theorem 1 the given net system does not admit a cut-off.

Suppose such a solution exists. By Lemma 2 we can find a non-negative rational solution **x** with maximum support in polynomial time. Let U contain all the transitions t such that **x**<sup>t</sup> = 0. We now check in polynomial time if the marking equation has a solution **<sup>y</sup>** over <sup>Z</sup><sup>T</sup> such that **<sup>y</sup>**<sup>t</sup> = 0 for every <sup>t</sup> <sup>∈</sup> <sup>U</sup>. By Theorem 1 such a solution exists iff the net system admits a cut-off.

The rendez-vous protocol given in Figure 2, which was stated in [17], is an example of a protocol where the smallest cut-off is exponential in the size of the protocol. In the next sections, we will actually prove that if a net system N (acyclic or not) admits a cut-off, then there is one with a polynomial number of bits in |N |.

**Fig. 2.** Example of a protocol with an exponential cut-off

#### **4 The Scaling and Insertion lemmas**

Similar to the case of acyclic net systems, we would like to provide a characterization of net systems admitting a cut-off and then use this characterization to derive a polynomial time algorithm. Unfortunately, in general net systems there is no characterization of reachability akin to Proposition 2 for acyclic systems. To this end, we prove two intermediate lemmas to help us come up with a characterization for cut-off admissible net systems in the general case. We believe that these two lemmas could be of independent interest in their own right. Further, the proofs of both lemmas are provided so that it will enable us later on to derive a bound on the cut-off for net systems.

#### **4.1 The Scaling Lemma**

The Scaling Lemma shows that, given a Petri net system (N, M, M′), whether nM <sup>∗</sup>−→ nM′ holds for some n ≥ 1 can be decided in polynomial time; moreover, if nM <sup>∗</sup>−→ nM′ holds for some n, then it holds for some n with at most (|N|(log ‖M‖ + log ‖M′‖))<sup>O(1)</sup> bits. The name of the lemma is due to the fact that the firing sequence leading from nM to nM′ is obtained by *scaling up* a *continuous firing sequence* from M to M′; the existence of such a continuous sequence can be decided in polynomial time [14].

In the rest of the section we first recall continuous Petri nets and the characterization of [14], and then present the Scaling Lemma<sup>2</sup>.

<sup>2</sup> The lemma is implicitly proved in [14], but the bound on the size of n is hidden in the details of the proof, and we make it explicit.

**Reachability in continuous Petri nets.** Petri nets can be given a *continuous semantics* (see e.g. [1,30,14]), in which markings are continuous multisets; we call them *continuous markings*. A continuous marking M enables a transition t *with factor* λ ∈ Q<sub>≥0</sub> if M(p) ≥ λ · *Pre*[p, t] for every place p; we also say that M enables λt. If M enables λt, then λt can fire or occur, leading to a new marking M′ given by M′(p) = M(p) + λ · A[p, t] for every p ∈ P. We denote this by M <sup>λt</sup>−→<sub>Q</sub> M′, and say that M′ is reached from M by firing λt. A *continuous firing sequence* is any sequence of transitions σ = λ<sub>1</sub>t<sub>1</sub>, λ<sub>2</sub>t<sub>2</sub>,...,λ<sub>k</sub>t<sub>k</sub> ∈ (Q<sub>≥0</sub> × T)<sup>∗</sup>. We let M <sup>σ</sup>−→<sub>Q</sub> M′ denote that there exist continuous markings M<sub>1</sub>,...,M<sub>k−1</sub> such that M <sup>λ<sub>1</sub>t<sub>1</sub></sup>−→<sub>Q</sub> M<sub>1</sub> <sup>λ<sub>2</sub>t<sub>2</sub></sup>−→<sub>Q</sub> M<sub>2</sub> ··· M<sub>k−1</sub> <sup>λ<sub>k</sub>t<sub>k</sub></sup>−→<sub>Q</sub> M′. Further, M <sup>∗</sup>−→<sub>Q</sub> M′ denotes that M <sup>σ</sup>−→<sub>Q</sub> M′ holds for some continuous firing sequence σ.

The *Parikh image* of σ = λ<sub>1</sub>t<sub>1</sub>, λ<sub>2</sub>t<sub>2</sub>,...,λ<sub>k</sub>t<sub>k</sub> ∈ (Q<sub>≥0</sub> × T)<sup>∗</sup> is the vector σ⃗ ∈ Q<sup>T</sup><sub>≥0</sub> where σ⃗[t] = Σ<sup>k</sup><sub>i=1</sub> δ<sub>i,t</sub>λ<sub>i</sub>, where δ<sub>i,t</sub> = 1 if t<sub>i</sub> = t and 0 otherwise. The support of σ is the support of its Parikh image σ⃗. If M <sup>σ</sup>−→<sub>Q</sub> M′ then σ⃗ is a solution of the marking equation over Q<sup>T</sup><sub>≥0</sub>, but the converse does not hold. In [14], Fraca and Haddad strengthen this necessary condition to make it also sufficient, and use the resulting characterization to derive a polynomial algorithm.

**Theorem 3 ([14]).** *Let* (N ,M,M ) *be a Petri net system.*


**Scaling.** It follows easily from the definitions that nM <sup>∗</sup>−→ nM′ holds for some n ≥ 1 iff M <sup>∗</sup>−→<sub>Q</sub> M′. Indeed, if M <sup>σ</sup>−→<sub>Q</sub> M′ for some σ = λ<sub>1</sub>t<sub>1</sub>, λ<sub>2</sub>t<sub>2</sub>,...,λ<sub>k</sub>t<sub>k</sub> ∈ (Q<sub>≥0</sub> × T)<sup>∗</sup>, then we can scale this continuous firing sequence to a discrete sequence nM <sup>nσ</sup>−→ nM′, where n is the smallest number such that nλ<sub>1</sub>, . . . , nλ<sub>k</sub> ∈ N, and nσ = t<sub>1</sub><sup>nλ<sub>1</sub></sup> t<sub>2</sub><sup>nλ<sub>2</sub></sup> ... t<sub>k</sub><sup>nλ<sub>k</sub></sup>. So Theorem 3 immediately implies that the existence of n ≥ 1 such that nM <sup>∗</sup>−→ nM′ can be decided in polynomial time. The following lemma also gives a bound on n.
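The scaling step itself is elementary; below is a Python sketch with exact fractions, where the continuous sequence is assumed to be given as a list of (factor, transition) pairs.

```python
from fractions import Fraction
from math import lcm

def scale_to_discrete(sigma):
    """Scale a continuous firing sequence to a discrete one: n = lcm of the denominators."""
    n = lcm(*(lam.denominator for lam, _ in sigma)) if sigma else 1
    discrete = []
    for lam, t in sigma:
        discrete.extend([t] * int(n * lam))   # fire t exactly n*lambda times
    return n, discrete

# Example: firing 1/3 of t1 and then 1/2 of t2 scales with n = 6.
print(scale_to_discrete([(Fraction(1, 3), "t1"), (Fraction(1, 2), "t2")]))
```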

**Lemma 3.** *Let* (N, M, M′) *be a Petri net system with weight* w *such that* M <sup>σ</sup>−→<sub>Q</sub> M′ *for some continuous firing sequence* σ ∈ (Q<sub>≥0</sub> × T)<sup>∗</sup>*. Let* m *be the number of transitions in* σ *and let* ℓ *be* ‖σ⃗‖*. Let* k *be the smallest natural number such that* kσ⃗ ∈ N<sup>T</sup>*. Then, there exists a firing sequence* τ ∈ T<sup>∗</sup> *such that* ⟦τ⟧ = ⟦σ⟧ *and*

$$\left(16w(w+1)^{2m}k\ell \cdot M\right) \xrightarrow{\tau} \left(16w(w+1)^{2m}k\ell \cdot M'\right)$$

**Lemma 4 (Scaling Lemma).** *Let* (N, M, M′) *be a Petri net system such that* M <sup>σ</sup>−→<sub>Q</sub> M′*. There exists a number* n *with a polynomial number of bits in* |N|(log ‖M‖ + log ‖M′‖) *such that* nM <sup>τ</sup>−→ nM′ *for some* τ *with* ⟦τ⟧ = ⟦σ⟧*.*

#### **4.2 The Insertion Lemma**

In the acyclic case, the existence of a cut-off is characterized by the existence of solutions to the marking equation Q<sup>T</sup> <sup>≥</sup><sup>0</sup> and <sup>Z</sup><sup>T</sup> . Intuitively, in the general case we replace the existence of solutions over Q<sup>T</sup> <sup>≥</sup><sup>0</sup> by the conditions of the Scaling Lemma, and the existence of solutions over Z<sup>T</sup> by the Insertion Lemma:

**Lemma 5 (Insertion Lemma).** *Let* M, M′, L, L′ *be markings of* N *satisfying* M <sup>σ</sup>−→ M′ *for some* σ ∈ T<sup>∗</sup> *and* L′ = L + A**y** *for some* **y** ∈ Z<sup>T</sup> *such that* ⟦**y**⟧ ⊆ ⟦σ⟧*. Then* μM + L <sup>∗</sup>−→ μM′ + L′ *for* μ = ‖**y**‖(‖σ⃗‖*nw* + *nw* + 1)*, where* w *is the weight of* N*, and* n *is the number of places in* <sup>•</sup>σ*.*

The idea of the proof is as follows: In a first stage, we asynchronously execute multiple "copies" of the firing sequence σ from multiple "copies" of the marking M, until we reach a marking at which all places of <sup>•</sup>σ contain a sufficiently large number of tokens. At this point we temporarily interrupt the executions of the copies of σ to *insert* a firing sequence with Parikh image ‖**y**‖σ⃗ + **y**. The net effect of this sequence is to transfer some copies of M to M′, leaving the other copies untouched, and exactly one copy of L to L′. In the third stage, we resume the interrupted executions of the copies of σ, which completes the transfer of the remaining copies of M to M′.

*Proof.* Let **<sup>x</sup>** be the Parikh image of <sup>σ</sup>, i.e., **<sup>x</sup>** <sup>=</sup> −→<sup>σ</sup> . Since <sup>M</sup> <sup>σ</sup> −→ M , by the marking equation we have M = M + A**x**

**First stage:** Let λ<sub>x</sub> = ‖**x**‖, λ<sub>y</sub> = ‖**y**‖ and μ = λ<sub>y</sub>(λ<sub>x</sub>n*w* + n*w* + 1). Let σ := r<sub>1</sub>, r<sub>2</sub>,...,r<sub>k</sub> and let M =: M<sub>0</sub> <sup>r<sub>1</sub></sup>−→ M<sub>1</sub> <sup>r<sub>2</sub></sup>−→ M<sub>2</sub> ... M<sub>k−1</sub> <sup>r<sub>k</sub></sup>−→ M<sub>k</sub> := M′. Notice that for each place p ∈ <sup>•</sup>σ, there exists a marking M<sub>i<sub>p</sub></sub> ∈ {M<sub>0</sub>,...,M<sub>k−1</sub>} such that M<sub>i<sub>p</sub></sub>(p) > 0.

Since each of the markings in {M<sub>i<sub>p</sub></sub>}<sub>p∈<sup>•</sup>σ</sub> can be obtained from M by firing a (suitable) prefix of σ, it is easy to see that from the marking μM + L = λ<sub>y</sub>M + L + (λ<sub>x</sub>λ<sub>y</sub>n*w* + λ<sub>y</sub>n*w*)M we can reach the marking First := λ<sub>y</sub>M + L + Σ<sub>p∈<sup>•</sup>σ</sub>(λ<sub>x</sub>λ<sub>y</sub>*w* + λ<sub>y</sub>*w*)M<sub>i<sub>p</sub></sub>. This completes our first stage.

**Second stage - Insert:** Since ⟦**y**⟧ ⊆ ⟦σ⟧, if **y**(t) ≠ 0 then **x**(t) ≠ 0. Since **x**(t) ≥ 0 for every transition, it now follows that (λ<sub>y</sub>**x** + **y**)(t) ≥ 0 for every transition t and (λ<sub>y</sub>**x** + **y**)(t) > 0 precisely for those transitions in ⟦σ⟧.

Let ξ be any firing sequence such that ξ⃗ = λ<sub>y</sub>**x** + **y**. Notice that for every place p ∈ <sup>•</sup>σ, First(p) ≥ λ<sub>x</sub>λ<sub>y</sub>*w* + λ<sub>y</sub>*w* ≥ ‖λ<sub>y</sub>**x** + **y**‖ · *w*. By an easy induction on ξ, it follows that First <sup>ξ</sup>−→ Second for some marking Second. By the marking equation, it follows that Second = λ<sub>y</sub>M′ + L′ + Σ<sub>p∈<sup>•</sup>σ</sub>(λ<sub>x</sub>λ<sub>y</sub>*w* + λ<sub>y</sub>*w*)M<sub>i<sub>p</sub></sub>. This completes our second stage.

**Third stage:** Notice that for each place p ∈ <sup>•</sup>σ, by construction of M<sub>i<sub>p</sub></sub>, there is a firing sequence which takes the marking M<sub>i<sub>p</sub></sub> to the marking M′. It then follows that there is a firing sequence which takes the marking Second to the marking λ<sub>y</sub>M′ + L′ + Σ<sub>p∈<sup>•</sup>σ</sub>(λ<sub>x</sub>λ<sub>y</sub>*w* + λ<sub>y</sub>*w*)M′ = μM′ + L′. This completes our third stage and also completes the desired firing sequence from μM + L to μM′ + L′.

#### **5 Polynomial time algorithm for the general case**

Let (N ,M,M ) be a net system with N = (P, T, P re, P ost), such that A is its incidence matrix. As in Section 3, we first characterize the Petri net systems that admit a cut-off, and then provide a polynomial time algorithm.

#### **5.1 Characterizing systems with cut-offs**

We generalize the characterization of Theorem 1 for acyclic Petri net systems to general systems.

**Theorem 4.** *A Petri net system* (N, M, M′) *admits a cut-off iff there exists some rational firing sequence* σ *such that* M <sup>σ</sup>−→<sub>Q</sub> M′ *and the marking equation has a solution* **y** ∈ Z<sup>T</sup> *such that* ⟦**y**⟧ ⊆ ⟦σ⟧*.*

*Proof.* (⇒): Assume (N, M, M′) admits a cut-off. Hence there exists B ∈ N such that for all n ≥ B we have nM <sup>∗</sup>−→ nM′. Similar to the proof of Theorem 1, we can show that there exist n ∈ N and firing sequences τ, τ′ such that nM <sup>τ</sup>−→ nM′, (n+1)M <sup>τ′</sup>−→ (n+1)M′ and ⟦τ′⟧ ⊆ ⟦τ⟧.

Let τ = t<sub>1</sub>t<sub>2</sub> ··· t<sub>k</sub>. Construct the rational firing sequence σ := (1/n)t<sub>1</sub> (1/n)t<sub>2</sub> ··· (1/n)t<sub>k</sub>. From the fact that nM <sup>τ</sup>−→ nM′, we can easily conclude by induction on k that M <sup>σ</sup>−→<sub>Q</sub> M′. Further, by the marking equation we have nM′ = nM + A τ⃗ and (n+1)M′ = (n+1)M + A τ⃗′. Let **y** = τ⃗′ − τ⃗. Then **y** ∈ Z<sup>T</sup> and M′ = M + A**y**. Further, since ⟦τ′⟧ ⊆ ⟦τ⟧ = ⟦σ⟧, we have ⟦**y**⟧ ⊆ ⟦σ⟧.

(⇐): Assume there exists a rational firing sequence σ and a vector **y** ∈ Z<sup>T</sup> such that ⟦**y**⟧ ⊆ ⟦σ⟧, M <sup>σ</sup>−→<sub>Q</sub> M′ and M′ = M + A**y**. Let s = |N|(log ‖M‖ + log ‖M′‖). It is well known that if a system of linear equations over the integers is feasible, then there is a solution which can be described using a number of bits which is polynomial in the size of the input (see e.g. [20]). Hence, we can assume that **y** can be described using s<sup>O(1)</sup> bits.

By Lemma 4 there exists n (which can be described using s<sup>O(1)</sup> bits) and a firing sequence τ with ⟦τ⟧ = ⟦σ⟧ such that nM <sup>τ</sup>−→ nM′. Hence knM <sup>∗</sup>−→ knM′ is also possible for any k ∈ N. By Lemma 5, there exists μ (which can once again be described using s<sup>O(1)</sup> bits) such that μnM + M <sup>∗</sup>−→ μnM′ + M′ is possible. By Lemma 1 the system (N, M, M′) admits a cut-off with a polynomial number of bits in s.

Notice that we have actually proved that if a net system admits a cut-off then it admits a cut-off with a polynomial number of bits in its size. Since the cut-off problem for a rendez-vous protocol P can be reduced to a cut-off problem for the Petri net system (N<sup>P</sup> , *init*, *fin*), it follows that,

**Corollary 1.** *If the system* (N, M, M′) *admits a cut-off then it admits a cut-off with a polynomial number of bits in* |N|(log ‖M‖ + log ‖M′‖)*. Hence, if a rendez-vous protocol* P *admits a cut-off then it admits a cut-off with a polynomial number of bits in* |P|*.*

#### **5.2 Polynomial time algorithm**

We use the characterization given in the previous section to provide a polynomial time algorithm for the cut-off problem. The following lemma, which was proved in [14] and whose proof is given in the appendix, enables us to find a firing sequence between two markings with maximum support.

**Lemma 6 ([14]).** *Among all the rational firing sequences* σ *such that* M <sup>σ</sup>−→<sub>Q</sub> M′*, there is one with maximum support. Moreover, the support of such a firing sequence can be found in polynomial time.*

We now have all the ingredients to prove the existence of a polynomial time algorithm.

**Theorem 5.** *The cut-off problem for net systems can be solved in polynomial time.*

*Proof.* First, we check that there is a rational firing sequence σ with M <sup>σ</sup>−→<sub>Q</sub> M′, which can be done in polynomial time by ([14], Proposition 27). If such a sequence does not exist, by Theorem 4 the given net system does not admit a cut-off.

Suppose such a sequence exists. By Lemma 6 we can find in polynomial time the maximum support S of all the rational firing sequences τ such that M <sup>τ</sup>−→<sub>Q</sub> M′. We now check in polynomial time if the marking equation has a solution **y** over Z<sup>T</sup> such that **y**(t) = 0 for every t ∉ S. By Theorem 4 such a solution exists iff the net system admits a cut-off.

This immediately proves that the cut-off problem for rendez-vous protocols is also in polynomial time. By an easy logspace reduction from the Circuit Value Problem [21], we prove that

**Lemma 7.** *The cut-off problem for rendez-vous protocols is P-hard.*

Clearly, this also proves that the cut-off problem for Petri nets is P-hard.

#### **6 Symmetric rendez-vous protocols**

In [17] Horn and Sangnier introduce symmetric rendez-vous protocols, where sending and receiving a message at each state has the same effect, and show that the cut-off problem is in NP. We improve on their result and show that it is in NC.

Recall that NC is the set of problems in P that can be solved in polylogarithmic *parallel* time, i.e., problems which can be solved by a uniform family of circuits with polylogarithmic depth and polynomial number of gates. Two wellknown problems which lie in NC are graph reachability and feasibility of linear equations over the finite field F<sup>2</sup> of size 2 [27,23]. We proceed to formally define symmetric protocols and state our results.

**Definition 3.** *A rendez-vous protocol* P = (Q, Σ, *init*, *fin*, R) *is* symmetric *iff its set of rules is symmetric under swapping* !a *and* ?a *for each* a ∈ Σ*, i.e., for each* a ∈ Σ*, we have* (q, !a, q′) ∈ R *iff* (q, ?a, q′) ∈ R*.*

Horn and Sangnier show that, because of their symmetric nature, there is a very easy characterization for cut-off admitting symmetric protocols.

**Proposition 3 ([17], Lemma 18).** *A symmetric protocol* P *admits a cut-off iff there exists an even number* e *and an odd number* o *such that* C<sup>e</sup><sub>init</sub> <sup>∗</sup>−→ C<sup>e</sup><sub>fin</sub> *and* C<sup>o</sup><sub>init</sub> <sup>∗</sup>−→ C<sup>o</sup><sub>fin</sub>*.*

From a symmetric protocol P, we can derive a graph G(P) where the vertices are the states and there is an edge between q and q′ iff there exists a ∈ Σ such that (q, !a, q′) ∈ R. The following proposition is immediate from the definition of symmetric protocols:

**Proposition 4.** *Let* P *be a symmetric protocol. There exists an even number* e *such that* C<sup>e</sup><sub>init</sub> <sup>∗</sup>−→ C<sup>e</sup><sub>fin</sub> *iff there is a path from init to fin in the graph* G(P)*.*

*Proof.* The left to right implication is obvious. For the other side, suppose there is a path *init*, q<sub>1</sub>, q<sub>2</sub>,...,q<sub>m−1</sub>, *fin* in the graph G(P). Then notice that {2·*init*} → {2·q<sub>1</sub>} → {2·q<sub>2</sub>} → ··· → {2·q<sub>m−1</sub>} → {2·*fin*} is a valid run of the protocol.

Since graph reachability is in NC , this takes care of the "even" case from Proposition 3. Hence, we only need to take care of the "odd" case from Proposition 3.

Fix a symmetric protocol P for the rest of the section. As a first step, for each state q ∈ Q, we compute whether there is a path from *init* to q and whether there is a path from q to *fin* in the graph G(P). Since graph reachability is in NC, this computation can be carried out in NC by running graph reachability in parallel for each q ∈ Q. If such paths exist for a state q then we call q a good state, and otherwise a bad state. The following proposition easily follows from the symmetric nature of P:
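A sketch of this preprocessing in Python, using plain BFS rather than an NC circuit (which is enough to show what is being computed; the rule format is the triple encoding assumed earlier):

```python
from collections import defaultdict, deque

def reachable(start, edges):
    """States reachable from `start` in the graph given by the adjacency map `edges`."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in edges[u]:
            if v not in seen:
                seen.add(v); queue.append(v)
    return seen

def good_states(Q, R, init, fin):
    """States q lying on some path from init to fin in the graph G(P)."""
    fwd, bwd = defaultdict(set), defaultdict(set)
    for (q, _, q2) in R:                 # an edge q -- q2 for every rule (q, !a/?a, q2)
        fwd[q].add(q2); bwd[q2].add(q)
    from_init = reachable(init, fwd)
    to_fin = reachable(fin, bwd)
    return {q for q in Q if q in from_init and q in to_fin}
```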

**Proposition 5.** *If q ∈ Q is a good state, then 2 · init −→∗ 2 · q and 2 · q −→∗ 2 · fin.*
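The good/bad classification is plain graph reachability. The following Python sketch (the data layout and all names are ours; it runs sequentially rather than in NC, but the logic is the same) computes the good states of a protocol given as a set of rules, via one forward and one backward reachability pass over G(P).

```python
from collections import defaultdict, deque

def reachable(adj, source):
    """Standard BFS over an adjacency map: all vertices reachable from source."""
    seen, queue = {source}, deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def good_states(states, rules, init, fin):
    """A state q is good iff init can reach q and q can reach fin in G(P).
    rules is a set of triples (q, a, q2); the letter is irrelevant for G(P)."""
    fwd = defaultdict(set)   # edges of G(P)
    bwd = defaultdict(set)   # reversed edges, used for "can reach fin"
    for (q, _a, q2) in rules:
        fwd[q].add(q2)
        bwd[q2].add(q)
    from_init = reachable(fwd, init)
    to_fin = reachable(bwd, fin)
    return {q for q in states if q in from_init and q in to_fin}

# Tiny hypothetical protocol: init -> p -> fin, plus an unreachable sink state.
states = {"init", "p", "fin", "sink"}
rules = {("init", "a", "p"), ("p", "b", "fin"), ("sink", "c", "sink")}
print(good_states(states, rules, "init", "fin"))   # {'init', 'p', 'fin'} (in some order)
```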

Similar to the general case of rendez-vous protocols, given a symmetric protocol P we can construct a Petri net N_P whose places are the states of P and which faithfully represents the reachability relation between configurations of P. Observe that this construction can be carried out in parallel over all the states in Q and over all pairs of rules in R. Let N = (P, T, Pre, Post) be the Petri net that we construct from the symmetric protocol P and let A be its incidence matrix. We now write the marking equation for N as follows: we introduce a variable **v**[t] for each transition t ∈ T and construct an equation system Eq enforcing the following three conditions:

**–** **v**[t] = 0 for every t ∈ T such that •t ∪ t• contains a bad state. By definition of a bad state, such transitions can never be fired on any run from an initial to a final configuration, so this requirement is safe.

**–** Σ_{t∈T} A[q, t] · **v**[t] = 0 for each q ∉ {*init*, *fin*}. Notice that the net effect of any run from an initial to a final configuration on any state not in {*init*, *fin*} is 0, hence this condition is valid as well.

**–** Σ_{t∈T} A[*init*, t] · **v**[t] = −1 and Σ_{t∈T} A[*fin*, t] · **v**[t] = 1.

It is clear that the construction of Eq can be carried out in parallel over each q ∈ Q and each t ∈ T. Finally, we solve Eq over arithmetic modulo 2, i.e., we solve Eq over the field F_2, which as mentioned before can be done in NC. We have:

**Lemma 8.** *There exists an odd number o such that C^o_init −→∗ C^o_fin iff the equation system Eq has a solution over F_2.*

*Proof.* (Sketch.) The left to right implication holds because we can take the marking equation modulo 2 on both sides. For the other side, we use an idea similar to Lemma 5. Let **x** be a solution to Eq over F_2. Using Proposition 5 we first populate all the good states of Q with enough processes such that all the good states except *init* have an even number of processes. Then we fire, exactly once, all the transitions t such that **x**[t] = 1. Since **x** satisfies Eq, we can argue that in the resulting configuration the number of processes at each bad state is 0 and the number of processes in each good state except *fin* is even. Hence, we can once again use Proposition 5 to move all the processes which are not at *fin* to the final state *fin*.

**Theorem 6.** *The problem of deciding whether a symmetric protocol admits a cut-off is in* NC*.*

*Proof.* By Proposition 3 it suffices to find an even number e and an odd number o such that C^e_init −→∗ C^e_fin and C^o_init −→∗ C^o_fin. By Proposition 4 the former can be done in NC. By Lemma 8 and the fact that the equation system Eq can be constructed and solved in NC, the latter can also be done in NC.
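The odd case thus amounts to solving a linear system over F_2. The sketch below (sequential Python, with names and data layout chosen by us) builds the system Eq as described above – one equation per state, with variables only for transitions that avoid bad states – and decides solvability by Gaussian elimination modulo 2.

```python
def solve_gf2(rows, rhs):
    """Gaussian elimination over F_2; returns True iff rows * v = rhs has a solution."""
    rows, rhs = [r[:] for r in rows], rhs[:]
    n_vars = len(rows[0]) if rows else 0
    pivot = 0
    for col in range(n_vars):
        hit = next((i for i in range(pivot, len(rows)) if rows[i][col]), None)
        if hit is None:
            continue
        rows[pivot], rows[hit] = rows[hit], rows[pivot]
        rhs[pivot], rhs[hit] = rhs[hit], rhs[pivot]
        for i in range(len(rows)):
            if i != pivot and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[pivot])]
                rhs[i] ^= rhs[pivot]
        pivot += 1
    # inconsistent iff some all-zero row is left with right-hand side 1
    return all(any(row) or b == 0 for row, b in zip(rows, rhs))

def odd_case_exists(states, transitions, init, fin, bad):
    """transitions: dict  name -> (pre, post), with pre/post mapping states to multiplicities.
    Variables are dropped (forced to 0) for transitions touching a bad state; the net effect
    on init must be -1 (= 1 mod 2), on fin must be 1, and 0 on every other state."""
    usable = [t for t, (pre, post) in transitions.items()
              if not (set(pre) | set(post)) & bad]
    rows, rhs = [], []
    for q in states:
        rows.append([(transitions[t][1].get(q, 0) - transitions[t][0].get(q, 0)) % 2
                     for t in usable])
        rhs.append(1 if q in (init, fin) else 0)
    return solve_gf2(rows, rhs)
```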

### **7 Symmetric protocols with leaders**

In this section, we extend symmetric rendez-vous protocols by adding a special process called leader. We state the cut-off problem for such protocols and prove that it is NP-complete.

**Definition 4.** *A symmetric leader protocol is a pair of symmetric protocols P = (P_L, P_F) where P_L = (Q_L, Σ, init_L, fin_L, R_L) is the leader protocol and P_F = (Q_F, Σ, init_F, fin_F, R_F) is the follower protocol, with Q_L ∩ Q_F = ∅.*

A configuration of a symmetric leader protocol P is a multiset C over Q_L ∪ Q_F such that Σ_{q∈Q_L} C(q) = 1. This corresponds to the intuition that exactly one process executes the leader protocol. For each n ∈ N, let C^n_init (resp. C^n_fin) denote the initial (resp. final) configuration of P given by C^n_init(init_L) = 1 (resp. C^n_fin(fin_L) = 1) and C^n_init(init_F) = n (resp. C^n_fin(fin_F) = n). We say that C =⇒ C′ if there exist (p, !a, p′), (q, ?a, q′) ∈ R_L ∪ R_F such that C ≥ ⦃p, q⦄ and C′ = C − ⦃p, q⦄ + ⦃p′, q′⦄. Since we allow at most one process to execute the leader protocol, given a configuration C, we let lead(C) denote the unique state q ∈ Q_L such that C(q) > 0.

**Definition 5.** *The cut-off problem for symmetric leader protocols is the following.*

*Input: A symmetric leader protocol P = (P_L, P_F). Output: Is there B ∈ N such that for all n ≥ B, C^n_init =⇒∗ C^n_fin?*

We know the following fact regarding symmetric leader protocols.

**Proposition 6.** *([17], Lemma 18) A symmetric leader protocol admits a cut-off iff there exist an even number e and an odd number o such that C^e_init =⇒∗ C^e_fin and C^o_init =⇒∗ C^o_fin.*

The main theorem of this section is

**Theorem 7.** *The cut-off problem for symmetric leader protocols is NP-complete.*

#### **7.1 A non-deterministic polynomial time algorithm**

Let P = (P_L, P_F) be a symmetric leader protocol with P_L = (Q_L, Σ, init_L, fin_L, R_L) and P_F = (Q_F, Σ, init_F, fin_F, R_F). Similar to the previous section, from P_F we can construct a graph G(P_F) where the vertices are given by the states Q_F and the edges are given by the rules in R_F. In G(P_F), we can clearly remove all vertices which are not reachable from the state init_F or which do not have a path to fin_F. In the sequel, we will assume that such vertices do not exist in G(P_F).

Similar to the general case, we will construct a Petri net N_P from the given symmetric leader protocol P. However, the construction is made slightly more complicated by the presence of a leader.

From P = (P_L, P_F), we construct a Petri net N = (P, T, Pre, Post) as follows: let P be Q_L ∪ Q_F. For each a ∈ Σ and r = (q, !a, s), r′ = (q′, ?a, s′) ∈ R_L ∪ R_F such that *at most one of r and r′ belongs to R_L*, we will have a transition t_{r,r′} ∈ T in N such that

**–** Pre[p, t] = 0 for every p ∉ {q, q′}, Post[p, t] = 0 for every p ∉ {s, s′}
**–** If q = q′ then Pre[q, t] = 2, otherwise Pre[q, t] = Pre[q′, t] = 1
**–** If s = s′ then Post[s, t] = 2, otherwise Post[s, t] = Post[s′, t] = 1.

Transitions t_{r,r′} in which exactly one of r, r′ is in R_L will be called *leader transitions* and transitions in which both of r, r′ are in R_F will be called *follower-only transitions*. Notice that if t is a leader transition, then there is a unique place p ∈ •t ∩ Q_L and a unique place p′ ∈ t• ∩ Q_L. These places will be denoted by t.*from* and t.*to* respectively.

As usual, we let A denote the incidence matrix of the constructed net N. The following proposition is obvious from the construction of the net N:

**Proposition 7.** *For two configurations C and C′, we have that C =⇒∗ C′ in the protocol P iff C −→∗ C′ in the net N.*

Because P is symmetric we have the following fact, which is easy to verify.

**Proposition 8.** *If q ∈ Q_F, then 2 · init_F −→∗ 2 · q −→∗ 2 · fin_F.*

For any vector **x** ∈ N^T, we define lead(**x**) to be the set of all leader transitions t such that **x**[t] > 0. The graph of the vector **x**, denoted by G(**x**), is defined as follows: the set of vertices is {t.*from* : t ∈ lead(**x**)} ∪ {t.*to* : t ∈ lead(**x**)} and the set of edges is {(t.*from*, t.*to*) : t ∈ lead(**x**)}. Further, for any two vectors **x**, **y** ∈ N^T and a transition t ∈ T, we say that **x** = **y**[t--] iff **x**[t] = **y**[t] − 1 and **x**[t′] = **y**[t′] for all t′ ≠ t.

**Definition 6.** *Let C be a configuration and let x ∈ N^T. We say that the pair (C, x) is compatible if C + Ax ≥ 0 and every vertex in G(x) is reachable from lead(C).*

The following lemma states that *as long as there are enough followers in every state*, it is possible for the leader to come up with a firing sequence from a compatible pair.

**Lemma 9.** *Suppose (C, x) is a compatible pair such that C(q) ≥ 2|x| for every q ∈ Q_F. Then there is a configuration D and a firing sequence ξ such that C −ξ→ D and −→ξ = x.*

*Proof.* (Sketch.) We prove the claim by induction on **x**. If **x**[t] > 0 for some follower-only transition t, then it is easy to verify that if we let C′ be such that C −t→ C′ and **x**′ be **x**[t--], then (C′, **x**′) is compatible and C′(q) ≥ 2|**x**′| for every q ∈ Q_F.

Suppose **x**[t] > 0 for some leader transition t. Let p = lead(C). If p belongs to some cycle S = p, r₁, p₁, r₂, p₂, ..., p_k, r_{k+1}, p in the graph G(**x**), then we let C −r₁→ C′ and **x**′ = **x**[r₁--]. It is easy to verify that C′ + A**x**′ ≥ **0**, C′(q) ≥ 2|**x**′| for every q ∈ Q_F and lead(C′) = p₁. Any path P in G(**x**) from p to some vertex s either goes through p₁, or we can use the cycle S to traverse from p₁ to p first and then use P to reach s. This gives a path from p₁ to every vertex s in G(**x**′).

If p does not belong to any cycle in G(**x**), then using the fact that C + A**x** ≥ 0, we can show that there is exactly one outgoing edge t from p in G(**x**). We then let C −t→ C′ and **x**′ = **x**[t--]. Since any path in G(**x**) from p has to necessarily use this edge t, it follows that in G(**x**′) there is a path from t.*to* = lead(C′) to every vertex.

**Lemma 10.** *Let par ∈ {0, 1}. There exists k ∈ N such that C^k_init −→∗ C^k_fin and k ≡ par (mod 2) iff there exist n ∈ N and x ∈ N^T such that n ≡ par (mod 2), (C^n_init, x) is compatible and C^n_fin = C^n_init + Ax.*

*Proof.* (Sketch.) The left to right implication is easy and follows from the marking equation along with induction on the number of leader transitions in the run. For the other side, we use an idea similar to Lemma 5. Let (C^n_init, **x**) be the given compatible pair. We first use Proposition 8 to populate all the states of Q_F with enough processes such that all the states of Q_F except init_F have an even number of processes. Then we use Lemma 9 to construct a firing sequence ξ which can be fired from C^n_init and such that −→ξ = **x**. By means of the marking equation, we then argue that in the resulting configuration the leader is in the final state, n followers are in the state fin_F, and every other follower state has an even number of followers. Once again, using Proposition 8 we can now move all the processes which are not at fin_F to the final state fin_F.

**Lemma 11.** *Given a symmetric leader protocol, checking whether a cut-off exists can be done in NP.*

*Proof.* By Proposition 6 it suffices to find an even number e and an odd number o such that C^e_init −→∗ C^e_fin and C^o_init −→∗ C^o_fin. Suppose we want to check that there exists 2k ∈ N such that C^{2k}_init −→∗ C^{2k}_fin. We first non-deterministically guess a set of leader transitions S = {t₁, ..., t_k} and check that for each t ∈ S, we can reach t.*from* and t.*to* from init_L using only the transitions in S.

Once we have guessed all this, we write a polynomially sized integer linear program as follows: we let **v** denote |T| variables, one for each transition in T, and we let n be another variable, with all these variables ranging over N. We then enforce the conditions C^{2n}_fin = C^{2n}_init + A**v** and **v**[t] = 0 ⟺ t ∉ S, and solve the resulting linear program, which we can do in non-deterministic polynomial time [26]. If there exists a solution, then we accept. Otherwise, we reject.

By Lemma 10 and by the definition of compatibility, it follows that at least one of our guesses gets accepted iff there exists 2k ∈ N such that C^{2k}_init −→∗ C^{2k}_fin. Similarly we can check whether there exists 2l + 1 ∈ N such that C^{2l+1}_init −→∗ C^{2l+1}_fin.
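To illustrate the feasibility check performed for one guessed set S, the sketch below phrases the integer program as constraints for the z3 solver (our choice of off-the-shelf solver; the proof only appeals to [26] for NP membership). It assumes the four places init_L, fin_L, init_F, fin_F are pairwise distinct and takes the incidence matrix and the guessed set S as given; the reachability side-condition on S is not repeated here.

```python
from z3 import Int, Solver, sat

def guessed_program_feasible(places, transitions, incidence, S,
                             init_L, fin_L, init_F, fin_F, parity):
    """incidence[p][t] = A[p, t]; S = guessed set of transitions fired at least once;
    parity = 0 for the even case, 1 for the odd case (2n or 2n+1 followers)."""
    solver = Solver()
    n = Int("n")
    v = {t: Int(f"v_{t}") for t in transitions}
    solver.add(n >= 0)
    for t in transitions:
        solver.add(v[t] >= 1 if t in S else v[t] == 0)   # v[t] = 0  iff  t not in S
    followers = 2 * n + parity
    for p in places:
        effect = sum(incidence[p].get(t, 0) * v[t] for t in transitions)
        if p == init_L:
            solver.add(effect == -1)                     # the leader leaves init_L ...
        elif p == fin_L:
            solver.add(effect == 1)                      # ... and ends up in fin_L
        elif p == init_F:
            solver.add(effect == -followers)             # all followers leave init_F ...
        elif p == fin_F:
            solver.add(effect == followers)              # ... and end up in fin_F
        else:
            solver.add(effect == 0)                      # every other place is untouched
    return solver.check() == sat
```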

By a reduction from 3-SAT, we prove that

**Lemma 12.** *The cut-off problem for symmetric leader protocols is NP-hard.*

#### **References**



### **Fixpoint Theory – Upside Down**

Paolo Baldan¹, Richard Eggert²(✉), Barbara König², and Tommaso Padoan¹

¹ Università di Padova, Padova, Italy   ² Universität Duisburg-Essen, Duisburg, Germany   richard.eggert@uni-due.de

**Abstract.** Knaster-Tarski's theorem, characterising the greatest fixpoint of a monotone function over a complete lattice as the largest postfixpoint, naturally leads to the so-called coinduction proof principle for showing that some element is below the greatest fixpoint (e.g., for providing bisimilarity witnesses). The dual principle, used for showing that an element is above the least fixpoint, is related to inductive invariants. In this paper we provide proof rules which are similar in spirit but for showing that an element is above the greatest fixpoint or, dually, below the least fixpoint. The theory is developed for non-expansive monotone functions on suitable lattices of the form M<sup>Y</sup> , where Y is a finite set and M an MV-algebra, and it is based on the construction of (finitary) approximations of the original functions. We show that our theory applies to a wide range of examples, including termination probabilities, behavioural distances for probabilistic automata and bisimilarity. Moreover it allows us to determine original algorithms for solving simple stochastic games.

### **1 Introduction**

Fixpoints are ubiquitous in computer science as they allow one to provide a meaning to inductive and coinductive definitions (see, e.g., [26,23]). By Knaster-Tarski's theorem [28], a monotone function f : L → L over a complete lattice (L, ⊑) admits a least fixpoint μf and a greatest fixpoint νf, which are characterised as the least pre-fixpoint and the greatest post-fixpoint, respectively. This immediately gives well-known proof principles for showing that a lattice element l ∈ L is *below* νf or *above* μf:

$$\frac{l \sqsubseteq f(l)}{l \sqsubseteq \nu f} \qquad\qquad \frac{f(l) \sqsubseteq l}{\mu f \sqsubseteq l}$$

On the other hand, showing that a given element l is *above* νf or *below* μf is more difficult. One can think of using the characterisation of least and greatest fixpoints via Kleene's iteration: e.g., the greatest fixpoint is the least element of the (possibly transfinite) descending chain obtained by iterating f from ⊤. Then, showing that f^i(⊤) ⊑ l for some i, one concludes that νf ⊑ l. This proof principle is related to the notion of ranking functions. However, this is a less satisfying notion of witness, since f has to be applied i times, and this can be inefficient or unfeasible when i is an infinite ordinal.


The aim of this paper is to present an alternative proof rule for this purpose for functions over lattices of the form L = M^Y, where Y is a finite set and M is an MV-chain, i.e., a totally ordered complete lattice endowed with suitable operations of sum and complement. This allows us to capture several examples, ranging from ordinary relations, used for dealing with bisimilarity, to behavioural metrics, termination probabilities and simple stochastic games.

Assume f : M^Y → M^Y is monotone and consider the question of proving that some fixpoint a : Y → M is the greatest fixpoint νf. The idea is to show that there is no "slack" or "wiggle room" in the fixpoint a that would allow us to further increase it. This is done by associating with every a : Y → M a function f^#_a on 2^Y whose greatest fixpoint gives us the elements of Y where we have a potential for increasing a by adding a constant. If no such potential exists, i.e. νf^#_a is empty, we conclude that a is νf. A similar function f_#^a (specifying decrease instead of increase) exists for the case of least fixpoints. Note that the premise is νf_#^a = ∅, i.e. the witness remains coinductive. The proof rules are:

$$\frac{f(a) = a \qquad \nu f\_a^\# = \emptyset}{\nu f = a} \qquad \frac{f(a) = a \qquad \nu f\_\#^a = \emptyset}{\mu f = a}$$

For applying the rule we compute a greatest fixpoint on 2^Y, which is finite, instead of working on the potentially infinite M^Y. The rule does not work for all monotone functions f : M^Y → M^Y, but we show that it is valid whenever f is non-expansive. Actually, it is not only sound but also reversible, i.e., if a = νf then νf^#_a = ∅, providing an if-and-only-if characterisation.

Quite interestingly, under the same assumptions on f, using a restricted function f^∗_a, the rule can be used, more generally, when a is just a *pre-fixpoint* (f(a) ⊑ a), and it allows us to conclude that νf ⊑ a. A dual result holds for *post-fixpoints* in the case of least fixpoints.

$$\frac{f(a) \sqsubseteq a \qquad \nu f\_a^\* = \emptyset}{\nu f \sqsubseteq a} \qquad\qquad \frac{a \sqsubseteq f(a) \qquad \nu f\_\*^a = \emptyset}{a \sqsubseteq \mu f}$$

As already mentioned, the theory above applies to many interesting scenarios: witnesses for non-bisimilarity, algorithms for simple stochastic games [11] and lower bounds for termination probabilities and behavioural metrics in the setting of probabilistic systems [1] and probabilistic automata [2]. In particular we were inspired by, and generalise, the self-closed relations of Fu [16], also used in [2].

*Motivating Example.* Consider a Markov chain (S, T, η) with a finite set of states S, where T ⊆ S are the terminal states and every state s ∈ S∖T is associated with a probability distribution η(s) ∈ D(S).³ Intuitively, η(s)(s′) denotes the probability of state s choosing s′ as its successor. Assume that, given a fixed state s ∈ S, we want to determine the termination probability of s, i.e. the probability of reaching any terminal state from s. As a concrete example, take the Markov chain given in Fig. 1, where u is the only terminal state.

³ D(S) is the set of all maps p : S → [0, 1] such that Σ_{s∈S} p(s) = 1.

$$\mathcal{T}: [0,1]^S \to [0,1]^S \qquad\qquad \mathcal{T}(t)(s) = \begin{cases} 1 & \text{if } s \in T\\ \sum\_{s' \in S} \eta(s)(s') \cdot t(s') & \text{otherwise} \end{cases}$$

Fig. 1: Function T (left) and a Markov chain with two fixpoints of T (right)

The termination probability arises as the least fixpoint of a function T defined as in Fig. 1. The values of μT are indicated in green (left value).

Now consider the function t assigning to each state the termination probability written in red (right value). It is not difficult to see that t is another fixpoint of T, in which states y and z convince each other incorrectly that they terminate with probability 1, resulting in a vicious cycle that gives "wrong" results. We want to show that μT ≠ t without knowing μT. Our idea is to compute the set of states that still have some "wiggle room", i.e., those states which could reduce their termination probability by δ if all their successors did the same. This definition has a coinductive flavour and it can be computed as a greatest fixpoint on the finite powerset 2^S of states, instead of on the infinite lattice [0, 1]^S.

We hence consider a function T^t_# : 2^{[S]^t} → 2^{[S]^t}, dependent on t, defined as follows. Let [S]^t be the set of all states s where t(s) > 0, i.e., where a reduction is in principle possible. Then a state s ∈ [S]^t is in T^t_#(S′) iff s ∉ T and for all s′ for which η(s)(s′) > 0 it holds that s′ ∈ S′, i.e. all successors of s are in S′.

The greatest fixpoint of T^t_# is {y, z}. The fact that it is not empty means that there is some "wiggle room", i.e., the value of t can be reduced on the elements of {y, z}, and thus t cannot be the least fixpoint of T. Moreover, the intuition that t can be improved on {y, z} can be made precise, leading to the possibility of performing the improvement and searching for the least fixpoint from there.
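This computation is easy to carry out mechanically. The following Python sketch runs the greatest-fixpoint iteration of T^t_# on a small hypothetical Markov chain in the spirit of Fig. 1 (the concrete chain and all names are ours, not a reconstruction of the figure).

```python
# Hypothetical Markov chain: u is terminal, y and z feed probability into each other,
# x moves to u or y with probability 1/2 each.
terminal = {"u"}
eta = {"x": {"u": 0.5, "y": 0.5}, "y": {"z": 1.0}, "z": {"y": 1.0}}
states = {"u", "x", "y", "z"}

# A fixpoint of T that is *not* the least one: y and z wrongly claim termination prob. 1.
t = {"u": 1.0, "x": 1.0, "y": 1.0, "z": 1.0}

def approx_step(t, S_prime):
    """One application of T^t_#: keep the non-terminal states with t(s) > 0 all of whose
    successors are themselves in S_prime."""
    return {s for s in S_prime
            if t[s] > 0 and s not in terminal and set(eta[s]) <= S_prime}

def wiggle_room(t):
    """Greatest fixpoint of T^t_# by downward Kleene iteration on the finite powerset."""
    S_prime = {s for s in states if t[s] > 0}      # start from the top element [S]^t
    while True:
        nxt = approx_step(t, S_prime)
        if nxt == S_prime:
            return S_prime
        S_prime = nxt

print(wiggle_room(t))   # {'y', 'z'}: t can be decreased there, so t is not the least fixpoint
```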

*Contributions.* In the paper we formalise the theory outlined above, showing that the proof rules work for non-expansive monotone functions f on lattices of the form M<sup>Y</sup> , where Y is a finite set and M an MV-algebra (§3 and §4). Additionally, given a decomposition of f we show how to obtain the corresponding approximation compositionally (§5). Then, in order to show that our approach covers a wide range of examples and allows us to derive original algorithms, we discuss various applications: termination probability, behavioural distances for probabilistic automata and bisimilarity (§6) and simple stochastic games (§7).

Proofs and further material can be found in the full version of the paper [5].

#### **2 Lattices and MV-Algebras**

In this section, we review some basic notions used in the paper.

A preordered or partially ordered set (P, ⊑) is often denoted simply as P, omitting the order relation. Given x, y ∈ P with x ⊑ y, we denote by [x, y] the interval {z ∈ P | x ⊑ z ⊑ y}. The *join* and the *meet* of a subset X ⊆ P (if they exist) are denoted ⨆X and ⨅X, respectively.

A *complete lattice* is a partially ordered set (L, ⊑) such that each subset X ⊆ L admits a join ⨆X and a meet ⨅X. A complete lattice (L, ⊑) always has a least element ⊥ = ⨆∅ and a greatest element ⊤ = ⨅∅.

A function f : L → L is *monotone* if for all l, l′ ∈ L, l ⊑ l′ implies f(l) ⊑ f(l′). By Knaster-Tarski's theorem [28, Thm. 1], any monotone function on a complete lattice has a least and a greatest fixpoint, denoted respectively μf and νf, characterised as the meet of all pre-fixpoints and the join of all post-fixpoints, respectively: μf = ⨅{l | f(l) ⊑ l} and νf = ⨆{l | l ⊑ f(l)}.

Let (C, ⊑), (A, ≤) be complete lattices. A *Galois connection* is a pair of monotone functions ⟨α, γ⟩ such that α : C → A, γ : A → C and for all a ∈ A and c ∈ C: α(c) ≤ a ⟺ c ⊑ γ(a). Equivalently, for all a ∈ A and c ∈ C, (i) c ⊑ γ(α(c)) and (ii) α(γ(a)) ≤ a. In this case we will write ⟨α, γ⟩ : C → A. For a Galois connection ⟨α, γ⟩ : C → A, the function α is called the left (or lower) adjoint and γ the right (or upper) adjoint.

Galois connections are at the heart of abstract interpretation [13,14]. In particular, when ⟨α, γ⟩ is a Galois connection, given monotone functions f^C : C → C and f^A : A → A, if f^C ∘ γ ⊑ γ ∘ f^A, then νf^C ⊑ γ(νf^A). If equality holds, i.e., f^C ∘ γ = γ ∘ f^A, then greatest fixpoints are preserved along the connection, i.e., νf^C = γ(νf^A).

Given a set Y and a complete lattice L, the set of functions L^Y = {f | f : Y → L}, endowed with the pointwise order (for a, b ∈ L^Y, a ⊑ b if a(y) ⊑ b(y) for all y ∈ Y), is a complete lattice.

In the paper we will mostly work with lattices of the kind M^Y where M is a special kind of lattice with a rich algebraic structure, namely an MV-algebra [21].

**Definition 1 (MV-algebra).** *An* MV-algebra *is a tuple M = (M, ⊕, 0, (·)‾ ) where (M, ⊕, 0) is a commutative monoid and (·)‾ : M → M maps each element x to its* complement *x̄, such that for all x, y ∈ M: (1) $\overline{\overline{x}} = x$; (2) $x \oplus \overline{0} = \overline{0}$; (3) $\overline{(\overline{x} \oplus y)} \oplus y = \overline{(\overline{y} \oplus x)} \oplus x$.*

*We denote* $1 = \overline{0}$*, multiplication* $x \otimes y = \overline{\overline{x} \oplus \overline{y}}$ *and subtraction* $x \ominus y = x \otimes \overline{y}$*.*

**Definition 2 (natural order).** *Let M = (M, ⊕, 0, (·)‾ ) be an MV-algebra. The* natural order *on M is defined, for x, y ∈ M, by x ⊑ y if x ⊕ z = y for some z ∈ M. When ⊑ is total, M is called an* MV-chain*.*

The natural order gives an MV-algebra a lattice structure where ⊥ = 0, ⊤ = 1, $x \sqcup y = (x \ominus y) \oplus y$ and $x \sqcap y = \overline{\overline{x} \sqcup \overline{y}} = x \otimes (\overline{x} \oplus y)$. We call the MV-algebra *complete* if it is a complete lattice, which is not true in general, e.g., for ([0, 1] ∩ Q, ≤).

*Example 3.* A prototypical example of an MV-algebra is ([0, 1], ⊕, 0, (·)‾ ) where x ⊕ y = min{x + y, 1} and x̄ = 1 − x for x, y ∈ [0, 1]. This means that x ⊗ y = max{x + y − 1, 0} and x ⊖ y = max{0, x − y} (truncated subtraction). The operators ⊕ and ⊗ are also known as strong disjunction and conjunction in Łukasiewicz logic [22]. The natural order is ≤ (less or equal) on the reals.

Another example is ({0, ..., k}, ⊕, 0, (·)‾ ) where n ⊕ m = min{n + m, k} and n̄ = k − n for n, m ∈ {0, ..., k}. Both MV-algebras are complete and are MV-chains.

Boolean algebras (with disjunction and complement) also form MV-algebras that are complete, but in general not MV-chains.

MV-algebras are the algebraic semantics of Łukasiewicz logic. They can be shown to correspond to intervals of the kind [0, u] in suitable groups, namely abelian lattice-ordered groups with a strong unit u [21].
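To make the prototypical MV-chain of Example 3 concrete, here is a small Python sketch of its operations (function names are ours); the derived join and meet indeed come out as max and min on [0, 1].

```python
# The standard MV-algebra on [0, 1] and its derived operations.
def oplus(x, y):        # truncated sum (strong disjunction)
    return min(x + y, 1.0)

def complement(x):      # the complement x-bar
    return 1.0 - x

def otimes(x, y):       # strong conjunction: complement(oplus(complement(x), complement(y)))
    return max(x + y - 1.0, 0.0)

def ominus(x, y):       # truncated subtraction: x (*) complement(y)
    return max(x - y, 0.0)

def join(x, y):         # x "join" y = (x - y) + y, which is max on [0, 1]
    return oplus(ominus(x, y), y)

def meet(x, y):         # x "meet" y = x (*) (x-bar + y), which is min on [0, 1]
    return otimes(x, oplus(complement(x), y))

assert join(0.3, 0.7) == 0.7 and meet(0.3, 0.7) == 0.3
```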

### **3 Non-expansive Functions and Their Approximations**

As mentioned in the introduction, our interest is in fixpoints of monotone functions f : M^Y → M^Y, where M is an MV-chain and Y is a finite set. We will see that for non-expansive functions we can over-approximate the set of points at which a given a ∈ M^Y can be increased in a way that is preserved by the application of f. This will be the core of the proof rules outlined earlier.

*Non-expansive Functions on MV-Algebras.* For defining non-expansiveness it is convenient to introduce a norm.

**Definition 4 (norm).** *Let M be an MV-chain and let Y be a finite set. Given a ∈ M^Y we define its* norm *as ‖a‖ = max{a(y) | y ∈ Y}.*

Given a finite set Y we extend ⊕ and ⊗ to M^Y pointwise. Given Y′ ⊆ Y and δ ∈ M, we write δ_{Y′} for the function defined by δ_{Y′}(y) = δ if y ∈ Y′ and δ_{Y′}(y) = 0 otherwise. Whenever this does not generate confusion, we write δ instead of δ_Y. It can be seen that ‖·‖ has the properties of a norm, i.e., for all a, b ∈ M^Y and δ ∈ M, it holds that (1) ‖a ⊕ b‖ ⊑ ‖a‖ ⊕ ‖b‖, (2) ‖δ ⊗ a‖ = δ ⊗ ‖a‖ and (3) ‖a‖ = 0 implies that a is the constant 0. Moreover, it is clearly monotone, i.e., if a ⊑ b then ‖a‖ ⊑ ‖b‖.

We next introduce non-expansiveness. Despite the fact that we will finally be interested in endo-functions f : M^Y → M^Y, in order to allow for compositional reasoning we work with functions where domain and codomain can be different.

**Definition 5 (non-expansiveness).** *Let f : M^Y → M^Z be a function, where M is an MV-chain and Y, Z are finite sets. We say that f is* non-expansive *if for all a, b ∈ M^Y it holds that ‖f(b) ⊖ f(a)‖ ⊑ ‖b ⊖ a‖.*

Note that (a, b) ↦ ‖a ⊖ b‖ is the supremum lifting of a directed version of Chang's distance [21]. It is easy to see that all non-expansive functions on MV-chains are monotone.

*Approximating the Propagation of Increases.* Let f : M^Y → M^Z be a monotone function and take a, b ∈ M^Y with a ⊑ b. We are interested in the difference b(y) ⊖ a(y) for some y ∈ Y and in how the application of f "propagates" this increase. The reason is that understanding that no increase can be propagated will be crucial to establish when a fixpoint of a non-expansive function f is actually the largest one, and, more generally, when a (pre-)fixpoint of f is above the largest fixpoint.

In order to formalise the above intuition, we rely on tools from abstract interpretation. In particular, the following pair of functions, which, under a suitable condition, form a Galois connection, will play a major role. The left adjoint α_{a,δ} takes as input a set Y′ and, for y ∈ Y′, it increases the value a(y) by δ, while the right adjoint γ_{a,δ} takes as input a function b ∈ M^Y with b ∈ [a, a ⊕ δ] and checks for which parameters y ∈ Y the value b(y) exceeds a(y) by δ.

We also define [Y]_a, the subset of elements of Y where a(y) is not 1 and thus there is a potential to increase, and δ_a, which gives us the minimal such increase.

**Definition 6 (functions to sets, and vice versa).** *Let M be an MV-algebra and let Y be a finite set. Define the set [Y]_a = {y ∈ Y | a(y) ≠ 1} and $\delta\_a = \min\{\overline{a(y)} \mid y \in [Y]\_a\}$, with min ∅ = 1.*

*For 0 ≠ δ ∈ M we consider the functions α_{a,δ} : 2^{[Y]_a} → [a, a ⊕ δ] and γ_{a,δ} : [a, a ⊕ δ] → 2^{[Y]_a}, defined, for Y′ ∈ 2^{[Y]_a} and b ∈ [a, a ⊕ δ], by*

$$
\alpha\_{a,\delta}(Y') = a \oplus \delta\_{Y'} \qquad\qquad \gamma\_{a,\delta}(b) = \{ y \in [Y]\_a \mid b(y) \ominus a(y) \sqsupseteq \delta \}.
$$

When δ is sufficiently small, the pair αa,δ, γa,δ is a Galois connection.

**Lemma 7 (Galois connection).** *Let M be an MV-algebra and Y a finite set. For 0 ≠ δ ⊑ δ_a, the pair ⟨α_{a,δ}, γ_{a,δ}⟩ : 2^{[Y]_a} → [a, a ⊕ δ] is a Galois connection.*

Whenever f is non-expansive, it is easy to see that it restricts to a function <sup>f</sup> : [a, a <sup>⊕</sup> <sup>δ</sup>] <sup>→</sup> [f(a), f(a) <sup>⊕</sup> <sup>δ</sup>] for all <sup>δ</sup> <sup>∈</sup> <sup>M</sup>.

As mentioned before, a crucial result shows that for all non-expansive functions, under the assumption that Y,Z are finite and the order on M is total, we can suitably approximate the propagation of increases. In order to state this result, a useful tool is a notion of approximation of a function.

**Definition 8 ((δ, a)-approximation).** *Let M be an MV-chain, let Y, Z be finite sets and let f : M^Y → M^Z be a non-expansive function. For a ∈ M^Y and any δ ∈ M we define f^#_{a,δ} : 2^{[Y]_a} → 2^{[Z]_{f(a)}} as f^#_{a,δ} = γ_{f(a),δ} ∘ f ∘ α_{a,δ}.*

Given Y′ ⊆ [Y]_a, its image f^#_{a,δ}(Y′) ⊆ [Z]_{f(a)} is the set of points z ∈ [Z]_{f(a)} such that δ ⊑ f(a ⊕ δ_{Y′})(z) ⊖ f(a)(z), i.e., the points to which f propagates an increase of the function a by δ on the subset Y′.
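As a concrete illustration (the example function and all names are chosen by us), the following sketch computes f^#_{a,δ}(Y′) for the MV-chain [0, 1] directly as the composition γ_{f(a),δ} ∘ f ∘ α_{a,δ}; the example f averages over fixed neighbours, which is one of the non-expansive basic functions discussed later in §5.

```python
def oplus(x, y):
    return min(x + y, 1.0)

def ominus(x, y):
    return max(x - y, 0.0)

# Hypothetical finite set Y = Z = {p, q, r}; f(a)(z) averages a over the neighbours of z.
neighbours = {"p": ["p", "q"], "q": ["q"], "r": ["p", "q"]}

def f(a):
    return {z: sum(a[y] for y in ys) / len(ys) for z, ys in neighbours.items()}

def approx(a, delta, Y_prime):
    """gamma_{f(a),delta} o f o alpha_{a,delta}, written out directly.
    Y_prime should be a subset of [Y]_a = {y | a(y) != 1}."""
    bumped = {y: oplus(a[y], delta) if y in Y_prime else a[y] for y in a}   # alpha
    fa, fb = f(a), f(bumped)
    Z_fa = {z for z in fa if fa[z] != 1.0}                                  # [Z]_{f(a)}
    return {z for z in Z_fa if ominus(fb[z], fa[z]) >= delta}               # gamma

a = {"p": 0.25, "q": 0.5, "r": 1.0}
print(approx(a, 0.25, {"q"}))   # {'q'}: only q receives the full increase of 0.25
```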

We first show that f^#_{a,δ} is antitone in the parameter δ, a non-trivial result.

**Lemma 9 (anti-monotonicity).** *Let M be an MV-chain, let Y, Z be finite sets, let f : M^Y → M^Z be a non-expansive function and let a ∈ M^Y. For θ, δ ∈ M, if θ ⊑ δ then f^#_{a,δ} ⊆ f^#_{a,θ}.*

Since f^#_{a,δ} increases when δ decreases and there are finitely many such functions, there must be a value ι^f_a such that all functions f^#_{a,δ} for 0 ≠ δ ⊑ ι^f_a are equal. This function is denoted by f^#_a and is called the a*-approximation of* f.

We next show that indeed, for all non-expansive functions, the a-approximation properly approximates the propagation of increases.

**Theorem 10 (approximation of non-expansive functions).** *Let M be a complete MV-chain, let Y, Z be finite sets and let f : M^Y → M^Z be a non-expansive function. Then there exists ι^f_a ∈ M, the largest value below or equal to δ_a, such that f^#_{a,δ} = f^#_{a,δ′} for all 0 ≠ δ, δ′ ⊑ ι^f_a.*

*We denote this function by f^#_a and call it the* a-approximation *of f. Then for all 0 ≠ δ ∈ M:*

*a. γ_{f(a),δ} ∘ f ⊆ f^#_a ∘ γ_{a,δ}*

*b. for δ ⊑ δ_a: δ ⊑ ι^f_a iff γ_{f(a),δ} ∘ f = f^#_a ∘ γ_{a,δ}*

(Diagram: the square [a, a ⊕ δ] →f [f(a), f(a) ⊕ δ], with γ_{a,δ} and γ_{f(a),δ} mapping down to 2^{[Y]_a} →f^#_a 2^{[Z]_{f(a)}}.)

Note that if Y = Z and a is a fixpoint of f, i.e., a = f(a), condition (a) above corresponds exactly to soundness in the sense of abstract interpretation [13], while condition (b) corresponds to (γ-)completeness (see also §2).

### **4 Proof Rules**

In this section we formalise the proof technique outlined in the introduction for showing that a fixpoint is the largest and, more generally, for checking overapproximations of greatest fixpoints of non-expansive functions.

Consider a monotone function f : M^Y → M^Y for some finite set Y. We first focus on the problem of establishing whether some given fixpoint a of f coincides with νf (without explicitly knowing νf), and, in case it does not, finding an "improvement", i.e., a post-fixpoint of f larger than a. Observe that when a is a fixpoint, [Y]_a = [Y]_{f(a)} and thus the a-approximation of f (Thm. 10) is an endofunction f^#_a : 2^{[Y]_a} → 2^{[Y]_a}. We have the following result, which relies on the fact that, due to Thm. 10, γ_{a,δ} preserves fixpoints (of f and f^#_a).

**Theorem 11 (soundness and completeness for fixpoints).** *Let M be a complete MV-chain, Y a finite set and f : M^Y → M^Y a non-expansive function. Let a ∈ M^Y be a fixpoint of f. Then νf^#_a = ∅ if and only if a = νf.*

Whenever a is a fixpoint, but not yet the largest fixpoint of f, we can increase it and obtain a post-fixpoint.

**Lemma 12.** *Let M be a complete MV-chain, f : M^Y → M^Y a non-expansive function, a ∈ M^Y a fixpoint of f, and let f^#_a be the corresponding a-approximation and ι^f_a as in Thm. 10. Then α_{a,ι^f_a}(νf^#_a) = a ⊕ (ι^f_a)_{νf^#_a} is a post-fixpoint of f.*

Using these results one can perform an alternative fixpoint iteration where we iterate towards the largest fixpoint from below: start with a post-fixpoint a₀ ⊑ f(a₀) (which is clearly below νf) and obtain, by (possibly transfinite) iteration, an ascending chain that converges to a, the least fixpoint above a₀. Now check with Thm. 11 whether Y′ = νf^#_a = ∅. If yes, we have reached νf = a. If not, α_{a,ι^f_a}(Y′) = a ⊕ (ι^f_a)_{Y′} is again a post-fixpoint (cf. Lem. 12) and we continue this procedure until – for some ordinal – we reach the largest fixpoint νf, for which we have νf^#_{νf} = ∅.
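Schematically, the iteration just described can be sketched as follows (Python, under the assumption that Kleene iteration reaches the least fixpoint above the current candidate in finitely many steps; the problem-specific ingredients – the function f, the greatest fixpoint of f^#_a, ι^f_a and α – are passed in as parameters and are not computed here).

```python
def gfp_from_below(f, approx_gfp, iota, alpha, a0, max_rounds=1000):
    """f: the monotone function; approx_gfp(a): the set nu f^#_a; iota(a): the value
    iota^f_a of Thm. 10; alpha(a, delta, Y): a increased by delta on Y; a0: a post-fixpoint."""
    a = a0
    for _ in range(max_rounds):
        while f(a) != a:                 # Kleene iteration up to the least fixpoint above a
            a = f(a)
        wiggle = approx_gfp(a)           # nu f^#_a
        if not wiggle:                   # empty: a is the greatest fixpoint (Thm. 11)
            return a
        a = alpha(a, iota(a), wiggle)    # Lem. 12: a larger post-fixpoint; continue from it
    raise RuntimeError("no convergence within max_rounds")
```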

Interestingly, the soundness result in Thm. 11 can be generalised to the case in which a is a pre-fixpoint instead of a fixpoint. In this case, the a-approximation for a function f : M^Y → M^Y is a function f^#_a : 2^{[Y]_a} → 2^{[Y]_{f(a)}} where domain and codomain are different, hence it would not be meaningful to look for fixpoints. However, as explained below, it can be restricted to an endofunction.

**Theorem 13 (soundness for pre-fixpoints).** *Let M be a complete MV-chain, Y a finite set and f : M^Y → M^Y a non-expansive function. Given a pre-fixpoint a ∈ M^Y of f, let [Y]_{a=f(a)} = {y ∈ [Y]_a | a(y) = f(a)(y)}. Let us define f^∗_a : 2^{[Y]_{a=f(a)}} → 2^{[Y]_{a=f(a)}} as f^∗_a(Y′) = f^#_a(Y′) ∩ [Y]_{a=f(a)}, where f^#_a : 2^{[Y]_a} → 2^{[Y]_{f(a)}} is the a-approximation of f. If νf^∗_a = ∅ then νf ⊑ a.*

Roughly, the intuition for the above result is the following: the value of f(a) at some y might or might not depend "circularly" on the value of a at y itself. In a purely inductive setting, without such circular dependencies, μf = νf and hence a being a pre-fixpoint means that we over-approximate νf. However, we might have vicious cycles, as explained in the introduction, that destroy the over-approximation since the values are too low. Now, since we restrict to non-expansive functions, it must be the case that there is a cycle such that all elements on this cycle are points where a and f(a) coincide. It is hence sufficient to check whether a given pre-fixpoint could be increased on the subpart on which it is a fixpoint, i.e., the idea is to restrict to [Y]_{a=f(a)}. We detect such situations by looking for "wiggle room" as for fixpoints.

Completeness does not generalise to pre-fixpoints, i.e., it is not true that if a is a pre-fixpoint of f and νf ⊑ a then νf^∗_a = ∅. A pre-fixpoint might contain slack even though it is above the greatest fixpoint. A counterexample is given in Ex. 25.

*The Dual View for Least Fixpoints.* The theory developed so far can easily be dualised to check under-approximations of least fixpoints. Given a complete MV-algebra M = (M, ⊕, 0, (·)‾ ) and a monotone function f : M^Y → M^Y, in order to show that a post-fixpoint a ∈ M^Y satisfies a ⊑ μf, we can in fact simply work in the dual MV-algebra M^op = (M, ⊗, 1, (·)‾ ). It is convenient to formulate the conditions using ⊖ and the original order.

We next outline the dualised setting. The notation for the dual case is obtained from that of the original (primal) case, exchanging subscripts and superscripts.

Given a ∈ M^Y, define [Y]^a = {y ∈ Y | a(y) ≠ 0} and δ^a = min{a(y) | y ∈ [Y]^a}. For θ ∈ M, we consider the pair of functions α^{a,θ}, γ^{a,θ} : 2^{[Y]^a} → [a ⊖ θ, a] where, for Y′ ∈ 2^{[Y]^a}, we let α^{a,θ}(Y′) = a ⊖ θ_{Y′} and, for b ∈ [a ⊖ θ, a], γ^{a,θ}(b) = {y ∈ [Y]^a | a(y) ⊖ b(y) ⊒ θ}.

A function f : M^Y → M^Z is non-expansive in the dual MV-algebra when it is in the primal one. Its approximation in the sense of Thm. 10 is denoted f_#^a.

Table 1: Basic functions f : M^Y → M^Z (constant, reindexing, minimum, maximum, average), function composition and disjoint union, together with the corresponding approximations f^#_a : 2^{[Y]_a} → 2^{[Z]_{f(a)}} and f_#^a : 2^{[Y]^a} → 2^{[Z]^{f(a)}}.

Notation: R⁻¹(z) = {y ∈ Y | y R z}, supp(p) = {y ∈ Y | p(y) > 0} for p ∈ D(Y), Min_a = {y ∈ Y | a(y) minimal}, Max_a = {y ∈ Y | a(y) maximal}, a : Y → M.
Then the dualisations of Thm. 11 and 13 hold, i.e., if a is a fixpoint of f, then νf_#^a = ∅ iff μf = a, and whenever a is a post-fixpoint, νf_∗^a = ∅ implies a ⊑ μf.

### **5 (De)Composing Functions and Approximations**

Given a non-expansive function f and a (pre/post-)fixpoint a, it is often nontrivial to determine the corresponding approximations. However, non-expansive functions enjoy good closure properties (closure under composition, and closure under disjoint union) and we will see that the same holds for the corresponding approximations. Furthermore it turns out that the functions needed in the applications can be obtained from just a few templates. This gives us a toolbox for assembling approximations with relative ease.

**Theorem 14.** *All basic functions listed in Table 1 are non-expansive. Furthermore non-expansive functions are closed under composition and disjoint union. The approximations are the ones listed in the third column of the table.*

#### **6 Applications**

#### **6.1 Termination Probability**

We start by making the example from the introduction (§1) more formal. Consider a Markov chain (S, T, η), as defined in the introduction (Fig. 1), where we restrict the codomain of η : S∖T → D(S) to a finite subset D ⊆ D(S) (to ensure that all involved sets are finite). Furthermore let T : [0, 1]^S → [0, 1]^S be the function from the introduction whose least fixpoint μT assigns to each state its termination probability.

**Lemma 15.** *The function T can be written as T = (η∗ ∘ av_D) ⊎ c_k where k : T → [0, 1] is the constant function 1 defined only on terminal states.*

From this representation and Thm. 14 it is obvious that T is non-expansive.

**Lemma 16.** *Let t : S → [0, 1]. The approximation for T in the dual sense is T^t_# : 2^{[S]^t} → 2^{[S]^{T(t)}} with*

$$\mathcal{T}\_\#^t(S') = \{ s \in [S]^{\mathcal{T}(t)} \mid s \notin T \land \operatorname{supp}(\eta(s)) \subseteq S' \}.$$

It is well-known that the function T can be tweaked in such a way that it has a unique fixpoint, coinciding with μT , by determining all states which cannot reach a terminal state and setting their value to zero [3]. Hence fixpoint iteration from above does not bring us any added value here. It does however make sense to use the proof rule in order to guarantee lower bounds via post-fixpoints.

Furthermore, termination probability is a special case of the considerably more complex stochastic games that will be studied in §7, where the trick of modifying the function is not applicable.

#### **6.2 Behavioural Metrics for Probabilistic Automata**

Before we start discussing probabilistic automata, we first consider the Hausdorff and the Kantorovich lifting and the corresponding approximations.

*Hausdorff Lifting.* Given a metric on a set X, the Hausdorff metric is obtained by lifting the original metric to 2^X. Here we define this for general distance functions valued in M, not restricting to metrics. In particular the Hausdorff lifting is given by a function H : M^{X×X} → M^{2^X × 2^X} where

$$\mathcal{H}(d)(X\_1, X\_2) = \max\{\max\_{x\_1 \in X\_1} \min\_{x\_2 \in X\_2} d(x\_1, x\_2), \max\_{x\_2 \in X\_2} \min\_{x\_1 \in X\_1} d(x\_1, x\_2)\}.$$
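Written out directly, the lifting is just a nested max/min computation. The following small Python sketch (names are ours) assumes X1 and X2 are non-empty and that the distance d is given as a nested dictionary.

```python
def hausdorff(d, X1, X2):
    """Hausdorff lifting of a distance d to sets X1, X2 (both assumed non-empty)."""
    forward  = max(min(d[x1][x2] for x2 in X2) for x1 in X1)
    backward = max(min(d[x1][x2] for x1 in X1) for x2 in X2)
    return max(forward, backward)

d = {"a": {"a": 0.0, "b": 0.4}, "b": {"a": 0.4, "b": 0.0}}
print(hausdorff(d, {"a"}, {"a", "b"}))   # 0.4
```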

An alternative characterisation due to Mémoli [20], also used in [4], is more convenient for our purposes. Let u : 2^{X×X} → 2^X × 2^X with u(C) = (π₁[C], π₂[C]), where π₁, π₂ are the projections π_i : X × X → X and π_i[C] = {π_i(c) | c ∈ C}. Then H(d)(X₁, X₂) = min{ max_{(x₁,x₂)∈C} d(x₁, x₂) | C ⊆ X × X ∧ u(C) = (X₁, X₂) }. Relying on this, we can obtain the result below, from which we deduce that H is non-expansive, and construct its approximation as the composition of the corresponding functions from Table 1.

**Lemma 17.** *H = min_u ∘ max_∈ where max_∈ : M^{X×X} → M^{2^{X×X}} (here ∈ ⊆ (X×X) × 2^{X×X} is the "is-element-of" relation on X × X) and min_u : M^{2^{X×X}} → M^{2^X × 2^X}.*

*Kantorovich Lifting.* The Kantorovich (also known as Wasserstein) lifting converts a metric on X to a metric on probability distributions over X. As for the Hausdorff lifting, we lift distance functions that are not necessarily metrics.

Furthermore, in order to ensure finiteness of all the sets involved, we restrict to D ⊆ D(X), some finite set of probability distributions over X. A *coupling* of p, q ∈ D is a probability distribution c ∈ D(X × X) whose left and right marginals are p and q, i.e., p(x₁) = m^L_c(x₁) := Σ_{x₂∈X} c(x₁, x₂) and q(x₂) = m^R_c(x₂) := Σ_{x₁∈X} c(x₁, x₂). The set of all couplings of p, q, denoted by Ω(p, q), forms a polytope with finitely many vertices [24]. The set of all polytope vertices obtained by coupling any p, q ∈ D is also finite and is denoted by VP_D ⊆ D(X × X).

The Kantorovich lifting is given by K : [0, 1]^{X×X} → [0, 1]^{D×D} where

$$\mathcal{K}(d)(p,q) = \min\_{c \in \Omega(p,q)} \sum\_{(x\_1, x\_2) \in X \times X} c(x\_1, x\_2) \cdot d(x\_1, x\_2).$$

The coupling c can be interpreted as the optimal transport plan to move goods from suppliers to customers [30]. Again there is an alternative characterisation, which shows non-expansiveness of K:
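Computationally, K(d)(p, q) is an optimal-transport linear program over couplings. The sketch below solves it with scipy.optimize.linprog (our choice of LP solver; the paper restricts attention to the finitely many polytope vertices VP_D only to keep all sets finite, whereas the LP below optimises over all couplings directly, which gives the same value).

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich(d, p, q, X):
    """d: nested dict distance on X; p, q: probability distributions over X as dicts."""
    X = list(X)
    n = len(X)
    idx = {(x1, x2): i * n + j for i, x1 in enumerate(X) for j, x2 in enumerate(X)}
    cost = np.array([d[x1][x2] for x1 in X for x2 in X])
    # equality constraints: left marginal of the coupling is p, right marginal is q
    A_eq = np.zeros((2 * n, n * n))
    b_eq = np.zeros(2 * n)
    for i, x1 in enumerate(X):
        for j, x2 in enumerate(X):
            A_eq[i, idx[(x1, x2)]] = 1.0          # row i: sums c(x1, .) to p(x1)
            A_eq[n + j, idx[(x1, x2)]] = 1.0      # row n+j: sums c(., x2) to q(x2)
        b_eq[i] = p.get(x1, 0.0)
    for j, x2 in enumerate(X):
        b_eq[n + j] = q.get(x2, 0.0)
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

d = {"a": {"a": 0.0, "b": 1.0}, "b": {"a": 1.0, "b": 0.0}}
p = {"a": 0.5, "b": 0.5}
q = {"a": 1.0}
print(kantorovich(d, p, q, {"a", "b"}))   # about 0.5: half of the mass moves from b to a
```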

**Lemma 18.** *Let u : VP_D → D × D, u(c) = (m^L_c, m^R_c). Then K = min_u ∘ av_{VP_D}, where av_{VP_D} : [0, 1]^{X×X} → [0, 1]^{VP_D} and min_u : [0, 1]^{VP_D} → [0, 1]^{D×D}.*

*Probabilistic Automata.* We now compare our approach with [2], which describes the first method for computing behavioural distances for probabilistic automata. Although the behavioural distance arises as a least fixpoint, it is in fact preferable – indeed, it is the only known method – to iterate from above in order to reach this least fixpoint. This is done by guessing and improving couplings, similar to the strategy iteration discussed later in §7. A major complication faced in [2] is that the procedure can get stuck at a fixpoint which is not the least one, and one has to detect that this is the case and decrease the current candidate. In fact this paper was our inspiration for generalising this technique to a more general setting.

A *probabilistic automaton* is a tuple A = (S, L, η, ℓ), where S is a non-empty finite set of states, L is a finite set of labels, η : S → 2^{D(S)} assigns finite sets of probability distributions to states and ℓ : S → L is a labelling function. (In the following we again replace D(S) by a finite subset D.)

The *probabilistic bisimilarity pseudometric* is the least fixpoint of the function M : [0, 1]^{S×S} → [0, 1]^{S×S} where for d : S × S → [0, 1] and s, t ∈ S:

$$\mathcal{M}(d)(s,t) = \begin{cases} 1 & \text{if } \ell(s) \neq \ell(t) \\ \mathcal{H}(\mathcal{K}(d))(\eta(s), \eta(t)) & \text{otherwise} \end{cases}$$

where H is the Hausdorff lifting (for M = [0, 1]) and K is the Kantorovich lifting defined earlier. Now assume that d is a fixpoint of M, i.e., d = M(d). In order to check whether d = μM, [2] adapts the notion of a self-closed relation from [16].

**Definition 19 ([2]).** *A relation M ⊆ S × S is* self-closed *wrt. d = M(d) if, whenever s M t, then*


The largest self-closed relation, denoted by ≈_d, is empty if and only if d = μM [2]. We now investigate the relation between self-closed relations and post-fixpoints of approximations. For this we will first show that M can be composed from non-expansive functions, which proves that it is indeed non-expansive. Furthermore, this decomposition will help in the comparison.

**Lemma 20.** *The fixpoint function* M *characterizing probabilistic bisimilarity pseudometrics can be written as:*

$$\mathcal{M} = \max\_{\rho} \circ (((\eta \times \eta)^\* \circ \mathcal{H} \circ \mathcal{K}) \uplus c\_l),$$

*where ρ : (S × S) ⊎ (S × S) → (S × S) with ρ((s, t), i) = (s, t).*⁴ *Furthermore l : S × S → [0, 1] is defined as l(s, t) = 0 if ℓ(s) = ℓ(t) and l(s, t) = 1 if ℓ(s) ≠ ℓ(t).*

Hence M is a composition of non-expansive functions and thus non-expansive itself. We do not spell out M_#^d explicitly, but instead show how it is related to self-closed relations.

**Proposition 21.** *Let d : S × S → [0, 1] with d = M(d). Then M_#^d : 2^{[S×S]^d} → 2^{[S×S]^d}, where [S × S]^d = {(s, t) ∈ S × S | d(s, t) > 0}.*

*Then M is a self-closed relation wrt. d if and only if M ⊆ [S × S]^d and M is a post-fixpoint of M_#^d.*

#### **6.3 Bisimilarity**

In order to define standard bisimilarity we use a variant of the Hausdorff lifting H from §6.2 in which max and min are swapped, and which we denote by G.

Now we can define the fixpoint function for bisimilarity and its corresponding approximation. For simplicity we consider unlabelled transition systems, but it would be straightforward to handle labelled transitions.

Let <sup>X</sup> be a finite set of states and <sup>η</sup> : <sup>X</sup> <sup>→</sup> **<sup>2</sup>**<sup>X</sup> a function that assigns a set of successors η(x) to a state x ∈ X. For the fixpoint function for bisimilarity <sup>B</sup> : {0, <sup>1</sup>}<sup>X</sup>×<sup>X</sup> → {0, <sup>1</sup>}<sup>X</sup>×<sup>X</sup> we use the Hausdorff lifting <sup>G</sup> with <sup>M</sup> <sup>=</sup> {0, <sup>1</sup>}.

**Lemma 22.** *Bisimilarity on* $\eta$ *is the greatest fixpoint of* $\mathcal{B} = (\eta \times \eta)^{*} \circ \mathcal{G}$*.*

<sup>4</sup> Here we use <sup>i</sup> ∈ {0, <sup>1</sup>} as indices to distinguish the elements in the disjoint union.

Since we are interested in the greatest fixpoint, we are working in the primal sense. Bisimulation relations are represented by their characteristic functions $d : X \times X \to \{0,1\}$; the corresponding relation can be obtained by taking the complement of $[X \times X]_d = \{(x_1, x_2) \in X \times X \mid d(x_1, x_2) = 0\}$.

**Lemma 23.** *Let* $d : X \times X \to \{0,1\}$*. The approximation for the bisimilarity function* $\mathcal{B}$ *in the primal sense is* $\mathcal{B}_d^{\#} : \mathbf{2}^{[X\times X]_d} \to \mathbf{2}^{[X\times X]_{\mathcal{B}(d)}}$ *with*

$$\begin{aligned} \mathcal{B}_d^{\#}(R) = \{ (x_1, x_2) \in [X \times X]_{\mathcal{B}(d)} \mid {}& \forall y_1 \in \eta(x_1)\, \exists y_2 \in \eta(x_2)\, \big( (y_1, y_2) \notin [X \times X]_d \lor (y_1, y_2) \in R \big) \\ {}\land{} & \forall y_2 \in \eta(x_2)\, \exists y_1 \in \eta(x_1)\, \big( (y_1, y_2) \notin [X \times X]_d \lor (y_1, y_2) \in R \big) \} \end{aligned}$$

We conclude this section by discussing how this view on bisimilarity can be useful: first, it again opens up the possibility to compute bisimilarity – a greatest fixpoint – by iterating from below, through smaller fixpoints. This could potentially be useful if it is easy to compute the least fixpoint of B inductively and continue from there.

Furthermore, we obtain a technique for witnessing non-bisimilarity of states. While this can also be done by exhibiting a distinguishing modal formula [17,9] or by a winning strategy for the spoiler in the bisimulation game [27], to our knowledge there is no known method that does this directly, based on the definition of bisimilarity.

With our technique, however, we can witness non-bisimilarity of two states $x_1, x_2 \in X$ by presenting a pre-fixpoint $d$ (i.e., $\mathcal{B}(d) \le d$) such that $d(x_1, x_2) = 0$ (equivalently $(x_1, x_2) \in [X \times X]_d$) and $\nu\mathcal{B}_d^{\#} = \emptyset$, since this implies $\nu\mathcal{B}(x_1, x_2) \le d(x_1, x_2) = 0$ by our proof rule.

There are two issues to discuss. First, how can we characterise a pre-fixpoint of $\mathcal{B}$ (which is quite unusual, since bisimulations are post-fixpoints)? In fact, the condition $\mathcal{B}(d) \le d$ can be rewritten to: for all $(x_1, x_2) \in [X \times X]_d$ there exists $y_1 \in \eta(x_1)$ such that for all $y_2 \in \eta(x_2)$ we have $(y_1, y_2) \in [X \times X]_d$ (*or* vice versa). Second, at first sight it does not seem as if we gained anything, since we still have to do a fixpoint computation on relations. However, the carrier set is $[X \times X]_d$, i.e., a set of non-bisimilarity witnesses, and this set can be small even though $X$ might be large.
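To illustrate this check, the following Python sketch (our own illustration, not code from the paper: the function name `nu_approx`, the dictionary `eta` and the tiny transition system are made up) iterates the condition of Lemma 23 on the witness pairs only, removing pairs until a fixpoint is reached; an empty result certifies non-bisimilarity by the proof rule.

```python
# A minimal sketch: checking a non-bisimilarity witness d by computing the
# greatest fixpoint of the approximation on the small carrier [X x X]_d.
# The transition system and the witness below are hypothetical.

def nu_approx(eta, witness):
    """Iterate the condition of Lemma 23 on the pairs where d = 0."""
    R = set(witness)
    changed = True
    while changed:
        changed = False
        for (x1, x2) in sorted(R):
            ok1 = all(any((y1, y2) not in witness or (y1, y2) in R
                          for y2 in eta[x2]) for y1 in eta[x1])
            ok2 = all(any((y1, y2) not in witness or (y1, y2) in R
                          for y1 in eta[x1]) for y2 in eta[x2])
            if not (ok1 and ok2):
                R.discard((x1, x2))
                changed = True
    return R

# Hypothetical system: x -> y, y has no successors, u has a self-loop.
# The witness claims that (x, u) and (y, u) are non-bisimilar (d = 0 there).
eta = {"x": {"y"}, "y": set(), "u": {"u"}}
witness = {("x", "u"), ("y", "u")}
print(nu_approx(eta, witness))   # set(): the witness is confirmed
```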

*Example 24.* We consider the transition system depicted below.

Our aim is to construct a witness showing that x, u are not bisimilar. This witness is a function $d : X \times X \to \{0,1\}$ with $d(x, u) = 0 = d(y, u)$ and value 1 for all other pairs.

Hence $[X \times X]_d = [X \times X]_{\mathcal{B}(d)} = \{(x, u), (y, u)\}$ and it is easy to check that $d$ is a pre-fixpoint of $\mathcal{B}$ and that $\nu\mathcal{B}_d^{\#} = \emptyset$: we iterate over $\{(x, u), (y, u)\}$ and first remove $(y, u)$ (since $y$ has no successors) and then $(x, u)$. This implies that $\nu\mathcal{B} \le d$ and hence $\nu\mathcal{B}(x, u) = 0$, which means that $x, u$ are not bisimilar.

*Example 25.* We modify Ex. 24 and consider a function $d$ where $d(x, u) = 0$ and all other values are 1. Again $d$ is a pre-fixpoint of $\mathcal{B}$ and $\nu\mathcal{B} \le d$ (since only reflexive pairs are in the bisimilarity). However $\nu\mathcal{B}_d^{\#} \neq \emptyset$, since $\{(x, u)\}$ is a post-fixpoint. This is a counterexample to the completeness discussed after Thm. 13.

Intuitively speaking, the states $y, u$ over-approximate and claim that they are bisimilar, although they are not. (This is permissible for a pre-fixpoint.) This tricks $x, u$ into thinking that there is some wiggle room and that one can increase the value of $(x, u)$. This is true, but only because of the limited, local view, since the "true" value of $(y, u)$ is 0.

### **7 Simple Stochastic Games**

*Introduction to Simple Stochastic Games.* In this section we show how our techniques can be applied to simple stochastic games [11,10]. A simple stochastic game is a state-based two-player game where the two players, Min and Max, each own a subset of states they control, for which they can choose the successor. The system also contains sink states with an assigned payoff and averaging states which randomly choose their successor based on a given probability distribution. The goal of Min is to minimise and the goal of Max to maximise the payoff.

Simple stochastic games are an important type of games that subsume parity games and the computation of behavioural distances for probabilistic automata (cf. §6.2, [2]). The associated decision problem is known to lie in NP∩coNP, but it is an open question whether it is contained in P. There are known randomised subexponential algorithms [7].

It has been shown that it is sufficient to consider positional strategies, i.e., strategies where the choice of the player is only dependent on the current state. The expected payoffs for each state form a so-called value vector and can be obtained as the least solution of a fixpoint equation (see below).

A *simple stochastic game* is given by a finite set $V$ of nodes, partitioned into *MIN*, *MAX*, *AV* (average) and *SINK*, and the following data: $\eta_{\min} : \mathit{MIN} \to \mathbf{2}^V$, $\eta_{\max} : \mathit{MAX} \to \mathbf{2}^V$ (successor functions for Min and Max nodes), $\eta_{\mathrm{av}} : \mathit{AV} \to D$ (probability distributions, where $D \subseteq \mathcal{D}(V)$ is finite) and $w : \mathit{SINK} \to [0,1]$ (weights of sink nodes).

The fixpoint function $\mathcal{V} : [0,1]^V \to [0,1]^V$ is defined below for $a : V \to [0,1]$ and $v \in V$:

$$\mathcal{V}(a)(v) = \begin{cases} \min_{v' \in \eta_{\min}(v)} a(v') & v \in \mathit{MIN} \\ \max_{v' \in \eta_{\max}(v)} a(v') & v \in \mathit{MAX} \\ \sum_{v' \in V} \eta_{\mathrm{av}}(v)(v') \cdot a(v') & v \in \mathit{AV} \\ w(v) & v \in \mathit{SINK} \end{cases}$$

The *least* fixpoint of $\mathcal{V}$ specifies the average payoff for all nodes when Min and Max play optimally. In an infinite game the payoff is 0. In order to avoid infinite games and guarantee uniqueness of the fixpoint, many authors [18,10,29] restrict to stopping games, which are guaranteed to terminate for every pair of Min/Max strategies. Here we deal with general games where more than one fixpoint may exist. Such a scenario has been studied in [19], which considers value iteration to under- and over-approximate the value vector. The over-approximation faces challenges with cyclic dependencies, similar to the vicious cycles described earlier. Here we focus on strategy iteration, which is usually less efficient than value iteration, but yields a precise result instead of approximating it.

*Example 26.* We consider the game depicted below. Here *min* is a Min node with $\eta_{\min}(\mathit{min}) = \{\mathbf{1}, \mathit{av}\}$, *max* is a Max node with $\eta_{\max}(\mathit{max}) = \{\varepsilon, \mathit{av}\}$, **1** is a sink node with payoff 1, *ε* is a sink node with some small payoff $\varepsilon \in (0,1)$ and *av* is an average node which transitions to both *min* and *max* with probability $\frac{1}{2}$.

Min should choose *av* as successor since a payoff of 1 is bad for Min. Given this choice of Min, Max should not choose *av* as successor since this would create an infinite play, in which case the payoff is 0. Therefore Max has to choose *ε* and be content with a payoff of ε, which is achieved from all nodes different from **1**.
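To make the definition of $\mathcal{V}$ and this example concrete, the following Python sketch (ours, not the authors' MATLAB implementation; node names and the concrete choice ε = 0.1 are arbitrary) encodes the game above and runs plain Kleene iteration from the all-zero vector, which converges towards the least fixpoint: payoff ε for every node except the sink **1**.

```python
# A minimal sketch of the fixpoint function V for the game of Ex. 26,
# together with Kleene iteration from below (eps = 0.1 is arbitrary).

MIN, MAX, AV, SINK = {"min"}, {"max"}, {"av"}, {"one", "eps"}
succ_min = {"min": {"one", "av"}}
succ_max = {"max": {"eps", "av"}}
dist_av  = {"av": {"min": 0.5, "max": 0.5}}
weight   = {"one": 1.0, "eps": 0.1}

def value_step(a):
    """One application of V(a), following the case distinction above."""
    new = {}
    for v in MIN:
        new[v] = min(a[u] for u in succ_min[v])
    for v in MAX:
        new[v] = max(a[u] for u in succ_max[v])
    for v in AV:
        new[v] = sum(p * a[u] for u, p in dist_av[v].items())
    for v in SINK:
        new[v] = weight[v]
    return new

a = {v: 0.0 for v in MIN | MAX | AV | SINK}
while True:                       # Kleene iteration with a small tolerance
    b = value_step(a)
    if max(abs(b[v] - a[v]) for v in a) < 1e-12:
        break
    a = b
print(a)   # all nodes except "one" approach eps = 0.1
```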

In order to be able to determine the approximation of V and to apply our techniques, we consider the following equivalent definition.

**Lemma 27.** $\mathcal{V} = (\eta_{\min}^{*} \circ \min{}_{\in}) \uplus (\eta_{\max}^{*} \circ \max{}_{\in}) \uplus (\eta_{\mathrm{av}}^{*} \circ \mathrm{av}_D) \uplus c_w$*, where* $\in\; \subseteq V \times \mathbf{2}^V$ *is the "is-element-of"-relation on* $V$*.*

As a composition of non-expansive functions, $\mathcal{V}$ is non-expansive as well. Since we are interested in the least fixpoint, we work in the dual sense and obtain the following approximation, which intuitively says: we can decrease the value at a node $v$ by a constant only if, in the case of a Min node, we decrease the value of one successor where the minimum is reached; in the case of a Max node, the values of all successors where the maximum is reached; and in the case of an average node, the values of all successors.

**Lemma 28.** *Let* $a : V \to [0,1]$*. The approximation for the value iteration function* $\mathcal{V}$ *in the dual sense is* $\mathcal{V}^{a}_{\#} : \mathbf{2}^{[V]^a} \to \mathbf{2}^{[V]^{\mathcal{V}(a)}}$ *with*

$$\begin{aligned} \mathcal{V}^{a}_{\#}(V') = \{ v \in [V]^{\mathcal{V}(a)} \mid {}& \big( v \in \mathit{MIN} \land \mathit{Min}_{a|_{\eta_{\min}(v)}} \cap V' \neq \emptyset \big) \lor {} \\ & \big( v \in \mathit{MAX} \land \mathit{Max}_{a|_{\eta_{\max}(v)}} \subseteq V' \big) \lor \big( v \in \mathit{AV} \land \mathrm{supp}(\eta_{\mathrm{av}}(v)) \subseteq V' \big) \} \end{aligned}$$

*Strategy Iteration from Above and Below.* We describe two algorithms, novel as far as we know, based on strategy iteration, which was first introduced by Hoffman and Karp in [18]. The first iterates to the least fixpoint from above and uses the techniques described in §4. The second iterates from below: the role of our results is not directly visible in the code of the algorithm, but its non-trivial correctness proof is based on the proof rule introduced earlier.


Fig. 2: Strategy iteration from above and below

We first recap the underlying notions: a Min-strategy is a mapping $\tau : \mathit{MIN} \to V$ such that $\tau(v) \in \eta_{\min}(v)$ for every $v \in \mathit{MIN}$. With such a strategy, Min decides to always leave a node $v$ via $\tau(v)$. Analogously, $\sigma : \mathit{MAX} \to V$ fixes a Max-strategy. Fixing a strategy for either player induces a modified value function. If $\tau$ is a Min-strategy, we obtain $\mathcal{V}_\tau$, which is defined exactly as $\mathcal{V}$ except that for $v \in \mathit{MIN}$ we set $\mathcal{V}_\tau(a)(v) = a(\tau(v))$. Analogously, for a Max-strategy $\sigma$, $\mathcal{V}_\sigma$ is obtained by setting $\mathcal{V}_\sigma(a)(v) = a(\sigma(v))$ for $v \in \mathit{MAX}$. If both players fix their strategies, the game reduces to a Markov chain.

In order to describe our algorithms we also need the notion of a *switch*. Assume that $\tau$ is a Min-strategy and let $a$ be a (pre-)fixpoint of $\mathcal{V}_\tau$. Min can now potentially improve her strategy at nodes $v \in \mathit{MIN}$ where $\min_{v' \in \eta_{\min}(v)} a(v') < a(\tau(v))$, called *switch nodes*. This results in a Min-strategy $\tau' = \mathit{sw}_{\min}(\tau, a)$, where⁵ $\tau'(v) = \arg\min_{v' \in \eta_{\min}(v)} a(v')$ for a switch node $v$ and $\tau', \tau$ agree otherwise. Also, $\mathit{sw}_{\max}(\sigma, a)$ is defined analogously for Max-strategies.
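As a small illustration of the switch operation, the following sketch (our own naming conventions and data, not the paper's code) computes $\mathit{sw}_{\min}(\tau, a)$ for a strategy given as a dictionary.

```python
# A sketch of sw_min: given a Min-strategy tau and a (pre-)fixpoint a of
# V_tau, switch at every node where some successor has a strictly smaller
# value than a(tau(v)); ties are broken arbitrarily (cf. footnote 5).

def sw_min(tau, a, succ_min):
    new_tau = dict(tau)
    for v, successors in succ_min.items():
        best = min(successors, key=lambda u: a[u])
        if a[best] < a[tau[v]]:          # v is a switch node
            new_tau[v] = best            # pick one minimising successor
    return new_tau

# Hypothetical usage resembling Ex. 26: Min currently goes to the sink with
# payoff 1, but the average node has a smaller value, so Min switches.
tau = {"min": "one"}
a   = {"min": 1.0, "max": 0.1, "av": 0.1, "one": 1.0, "eps": 0.1}
print(sw_min(tau, a, {"min": {"one", "av"}}))   # {'min': 'av'}
```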

Now strategy iteration from above works as described in Figure 2a. The computation of $\mu\mathcal{V}_{\tau^{(i)}}$ in the second step intuitively means that Max chooses his best answering strategy and we compute the least fixpoint based on this answering strategy. At some point no further switches are possible and we have reached a fixpoint $a$, which need not yet be the least fixpoint. Hence we use the techniques from §4 to decrease $a$ and obtain a new pre-fixpoint $a^{(i+1)}$, from which we can continue. The correctness of this procedure partially follows from Thm. 11 and Lem. 12; however, we also need to show the following: first, we can compute $a^{(i)} = \mu\mathcal{V}_{\tau^{(i)}}$ efficiently by solving a linear program (cf. Lem. 29), adapting [11]. Second, the chain of the $a^{(i)}$ decreases, which means that the algorithm will eventually terminate (cf. Thm. 30).

<sup>5</sup> If the minimum is achieved in several nodes, Min simply chooses one of them. However, she will only switch if this strictly improves the value.

Strategy iteration from below is given in Figure 2b. At first sight, the algorithm looks simpler than strategy iteration from above, since we do not have to check whether we have already reached the least fixpoint $\mu\mathcal{V}$, reduce and continue from there. However, in this case the computation of $\mu\mathcal{V}_{\sigma^{(i)}}$ via a linear program is more involved (cf. Lem. 29), since we have to pre-compute (via greatest fixpoint iteration over $\mathbf{2}^V$) the nodes where Min can force a cycle based on the current strategy of Max, thus obtaining payoff 0.

This algorithm does not directly use our technique, but we can use our proof rules to prove the correctness of the algorithm (Thm. 30). In particular, the proof that the sequence $a^{(i)}$ increases is quite involved: we have to show that $a^{(i)} = \mu\mathcal{V}_{\sigma^{(i)}} \le \mu\mathcal{V}_{\sigma^{(i+1)}} = a^{(i+1)}$. We prove this, using our proof rules, by showing that $a^{(i)}$ is below the least fixpoint of $\mathcal{V}_{\sigma^{(i+1)}}$.

The algorithm generalises strategy iteration by Hoffman and Karp [18]. Note that we cannot simply adapt their proof, since we do not assume that the game is stopping, which is a crucial ingredient.

**Lemma 29.** *The least fixpoints of* $\mathcal{V}_\tau$ *and* $\mathcal{V}_\sigma$ *can be determined by solving linear programs.*

**Theorem 30.** *Strategy iteration from above and below both terminate and compute the least fixpoint of* V*.*

*Example 31.* Ex. 26 is well suited to explain our two algorithms.

Starting with strategy iteration from above, we may guess $\tau^{(0)}(\mathit{min}) = \mathbf{1}$. In this case, Max would choose *av* as successor and we would reach a fixpoint where each node except for *ε* is associated with a payoff of 1. Next, our algorithm would detect the vicious cycle formed by *min*, *av* and *max*. We can reduce the values in this vicious cycle and reach the correct payoff values for each node.

For strategy iteration from below assume that $\sigma^{(0)}(\mathit{max}) = \mathit{av}$. Given this strategy of Max, Min can force the play to stay in the cycle formed by *min*, *av* and *max*. Thus, the payoff achieved by the Max strategy $\sigma^{(0)}$ and an optimal play by Min would be 0 for each of these nodes. In the next iteration Max switches and chooses *ε* as successor, i.e. $\sigma^{(1)}(\mathit{max}) = \varepsilon$, which results in the correct values.

We implemented strategy iteration from above and below as well as classical Kleene iteration in MATLAB. In Kleene iteration we terminate with a tolerance of $10^{-14}$, i.e., we stop if the change from one iteration to the next is below this bound. We tested the algorithms on random stochastic games and found that Kleene iteration is always the fastest, but it only converges in the limit and it is known that the rate of convergence can be exponentially slow [10]. Strategy iteration from below is usually slightly faster than strategy iteration from above. More details can be found in the full version [5].

### **8 Conclusion**

It is well-known that several computations in the context of system verification can be performed by various forms of fixpoint iteration, and it is worthwhile to study such methods at a high level of abstraction, typically in the setting of complete lattices and monotone functions. Going beyond the classical results by Tarski [28], the combination of fixpoint iteration with approximations [14,6] and with up-to techniques [25] has proven successful. Here we treated a more specific setting, where the carrier set consists of functions from a finite set into an MV-chain and the fixpoint functions are non-expansive (and hence monotone), and introduced a novel technique to obtain upper bounds for greatest and lower bounds for least fixpoints, including associated algorithms. Such techniques are applicable to a wide range of examples, yet so far they have been studied only in quite specific scenarios, such as in [2,16,19].

In the future we plan to lift some of the restrictions of our approach. First, an extension to an infinite domain $Y$ would of course be desirable, but since several of our results currently depend on finiteness, such a generalisation does not seem to be easy. Another restriction, to total orders, seems easier to lift: in particular, if the partially ordered MV-algebra $\bar{M}$ is of the form $M^I$, where $I$ is a finite index set and $M$ an MV-chain (e.g., finite Boolean algebras are of this type), then our function space is $\bar{M}^Y = (M^I)^Y \cong M^{Y \times I}$ and we have reduced the problem to the setting presented in this paper. This will allow us to handle featured transition systems [12], where transitions are equipped with boolean formulas. We also plan to determine the largest possible increase that can be added to a fixpoint that is not yet the greatest fixpoint, in order to maximally speed up fixpoint iteration from below (this might be larger than $\iota_f^a$).

There are several other application examples that did not fit into this paper, but that can also be handled by our approach: for instance behavioural distances for metric transition systems [15] and other types of systems [4]. We also plan to investigate other types of games, such as energy games [8]. While here we introduced strategy iteration techniques for simple stochastic games, we also want to check whether we can provide an improvement to value iteration techniques, combining our approach with [19].

We also plan to study whether some examples can be handled with other types of Galois connections: here we used an additive variant, but looking at multiplicative variants (multiplication by a constant factor) might also be fruitful.

*Acknowledgements:* We are grateful to Ichiro Hasuo for making us aware of stochastic games as application domain. Furthermore we would like to thank Matthias Kuntz and Timo Matt for their help with experiments.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/ 4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **"Most of" leads to undecidability: Failure of adding frequencies to LTL**

Bartosz Bednarczyk<sup>1,2</sup> and Jakub Michaliszyn<sup>2</sup>

<sup>1</sup> Computational Logic Group, Technische Universität Dresden, Dresden, Germany <sup>2</sup> Institute of Computer Science, University of Wrocław, Wrocław, Poland {bartosz.bednarczyk, jakub.michaliszyn}@cs.uni.wroc.pl

**Abstract.** Linear Temporal Logic (LTL) interpreted on finite traces is a robust specification framework popular in formal verification. However, despite the high interest in the logic in recent years, the topic of its quantitative extensions is not yet fully explored. The main goal of this work is to study the effect of adding weak forms of percentage constraints (*e*.*g*. that *most of* the positions in the past satisfy a given condition, or that *σ* is the *most-frequent* letter occurring in the past) to fragments of LTL. Such extensions could potentially be used for the verification of influence networks or statistical reasoning. Unfortunately, as we prove in the paper, it turns out that percentage extensions of even tiny fragments of LTL have undecidable satisfiability and model-checking problems. Our undecidability proofs not only sharpen most of the undecidability results on logics with arithmetics interpreted on words known from the literature, but are also fairly simple. We also show that the undecidability can be avoided by restricting the allowed usage of the negation, and discuss how the undecidability results transfer to first-order logic on words.

### **1 Introduction**

Linear Temporal Logic [29] (LTL) interpreted on finite traces is a robust logical framework used in formal verification [1,18,19]. However, LTL is not perfect: it can express whether some event happens or not, but it cannot provide any insight on how frequently such an event occurs or for how long such an event took place. In many practical applications, such *quantitative* information is important: think of optimising a server based on how frequently it receives messages or optimising energy consumption knowing for how long a system is usually used in rush hours. Nevertheless, there is a solution: one can achieve such goals by adding quantitative features to LTL.

It is known that adding quantitative operators to LTL often leads to undecidability. The proofs, however, typically involve operators such as "next" or "until", and are often quite complicated (see the discussion on the related work below). In this work, we study the logic LTL**F**, a fragment of LTL where the only allowed temporal operator is "sometimes in the future" **F** . We extend its language with two types of operators, sharing a similar "percentage" flavour: with the *Past-Majority* **PM***ϕ* operator (stating that most of the past positions satisfy a formula *ϕ*), and with the *Most-Frequent-Letter* **MFL** *σ* predicates (meaning that the letter *σ* is among the most frequent letters appearing in the past). These operators can be used to express a number of interesting properties, such as *if a process failed to enter the critical section, then the other process was in the critical section the majority of time*. Of course, for practical applications, we could also consider richer languages, such as parametrised versions of these operators, *e*.*g*. stating that *at least a fraction p of positions in the past satisfies a formula*. However, we show, as our main result, that even these very simple percentage operators lead to undecidability when combined with **F** .

To make the undecidability proof for both operators similar, we define an intermediate operator, **Half** , which is satisfied when exactly half of the past positions satisfy a given formula. The **Half** operator can be expressed easily with **PM**, but not with **MFL** — we show, however, that we can simulate it to an extent sufficient to show undecidability. Our proof method relies on enforcing a model to be in the language ({*wht*}{*shdw*})<sup>+</sup>, for some letters *wht* and *shdw*, which a priori seems to be impossible without the "next" operator. Then, thanks to the specific shape of the models, we show that one can "transfer" the truth of certain formulae from positions into their successors, hence the "next" operator can be partially expressed. With a combination of these two ideas, we show that it is possible to write equicardinality statements in the logic. Finally, we perform a reduction from the reachability problem of Two-counter Machines [26]. In the reduction, the equicardinality statements will be responsible for handling zero-tests. The idea of transferring predicates from each position into its successor will be used for switching the machine into its next configuration.

The presented undecidability proof of LTL with percentage operators can be adjusted to extensions of fragments of first-order logic on finite words. We show that $\mathrm{FO}^2_{\mathrm{M}}[<]$, *i*.*e*. the two-variable fragment of first-order logic admitting the majority quantifier M and the linear order predicate *<*, has an undecidable satisfiability problem. Here the meaning of a formula M*x.ϕ*(*x, y*) is that at least half of the possible interpretations of *x* satisfy *ϕ*(*x, y*). Our result sharpens an existing undecidability proof for (full) FO with Majority from [23] (since in our case the number of variables is limited) but also for $\mathrm{FO}^2[<, \mathit{succ}]$ with arithmetics from [25] (since our counting mechanism is weaker and the successor relation *succ* is disallowed). On the positive side, we show that the undecidability heavily depends on the presence of the negation in front of the percentage operators. To this end, we introduce a logic, extending full LTL, in which the usage of percentage operators is possible but suitably restricted. For this logic, we show that the satisfiability problem is decidable.

All the above-mentioned results can be easily extended to the model checking problem, where the question is whether a given Kripke structure satisfies a given formula. The full version of the paper is available on arXiv [4].

#### **1.1 Related work**

The first paper studying the addition of quantitative features to logic was [21], where the authors proved undecidability of Weak MSO with Cardinalities. They also developed the model of the so-called Parikh automaton, a finite automaton imposing a semi-linear constraint on the set of its final configurations. Such an automaton was successfully used to decide logics with counting as well as logics on data words [27,17]. Its expressiveness was studied in [11].

Another idea in the realm of quantitative features is availability languages [20], which extend regular expressions by numerical occurrence constraints on the letters. However, their high expressivity leads to undecidable emptiness problems. Weak forms of arithmetics have also attracted interest from researchers working on temporal logics. Several extensions of LTL were studied, including extensions with counting [24], periodicity constraints [14], accumulative values [7], discounting [2], averaging [9] and frequency constraints [8]. A lot of work was done to understand LTL with timed constraints, *e*.*g*. a metric LTL was considered in [28]. However, its complexity is high and its extensions are undecidable [3].

Arithmetical constraints can also be added to the First-Order logic (FO) on words via so-called counting quantifiers. It is known that weak MSO on words is decidable with threshold counting and modulo-counting (thanks to the famous Büchi theorem [10]), while even FO on words with percentage quantifiers becomes undecidable [23]. Extensions of fragments of FO on words are often decidable, *e*.*g*. the two-variable fragment FO<sup>2</sup> with counting [12] or FO<sup>2</sup> with modulo-counting [25]. The investigation of decidable extensions of FO<sup>2</sup> is limited by the undecidability of FO<sup>2</sup> on words with Presburger constraints [25].

Among the above-mentioned logics, the formalisms of this paper are most similar to Frequency LTL [8]. The satisfiability problem for Frequency LTL was claimed to be undecidable, but the undecidability proof as presented in [8] is flawed (see [9, Sec. 8] for a discussion). It was mentioned in [9] that the undecidability proof from [8] can be patched, but no correction has been published so far. Our paper not only provides a valid proof but also sharpens the result, as we use a far less expressive language (*e*.*g*. we are allowed to use neither the "until" operator nor the "next" operator). We also believe that our proof is simpler. The second-closest formalism to ours is average-LTL [9]. The main difference is that the averages of average-LTL are computed based on the future, while in our paper the averages are based on the past. The second difference, as in the previous case, is that their undecidability proof uses more expressive operators, such as the "until" operator.

### **2 Preliminaries**

We recall definitions concerning logics on words and temporal logics (*cf*. [15]).

*Words and logics.* Let AP be a countably-infinite set of *atomic propositions*, called here also *letters*. A finite *word* $\mathfrak{w} \in (2^{\mathsf{AP}})^{*}$ is a non-empty finite sequence of *positions* labelled with sets of letters from AP. A set of words is called a *language*. Given a word $\mathfrak{w}$, we denote its $i$-th position with $\mathfrak{w}_i$ (where the first position is $\mathfrak{w}_0$) and its prefix up to the $i$-th position with $\mathfrak{w}_{\le i}$. We usually use the letters $p, q, i, j$ to denote positions. With $|\mathfrak{w}|$ we denote the length of $\mathfrak{w}$.

The syntax of LTL**F**, a fragment of LTL with only the *finally* operator **F**, is defined by the grammar: $\varphi, \varphi' ::= a \ (\text{with } a \in \mathsf{AP}) \mid \neg\varphi \mid \varphi \land \varphi' \mid \mathbf{F}\,\varphi$.

The satisfaction relation |= is defined for words as follows:

$$\begin{array}{ll} \mathfrak{w},i \models a & \text{if } a \in \mathfrak{w}_{i} \\ \mathfrak{w},i \models \neg\varphi & \text{if not } \mathfrak{w},i \models \varphi \\ \mathfrak{w},i \models \varphi_{1} \land \varphi_{2} & \text{if } \mathfrak{w},i \models \varphi_{1} \text{ and } \mathfrak{w},i \models \varphi_{2} \\ \mathfrak{w},i \models \mathbf{F}\varphi & \text{if } \exists j \text{ such that } |\mathfrak{w}| > j \ge i \text{ and } \mathfrak{w},j \models \varphi. \end{array}$$

We write w |= *ϕ* if w*,* 0 |= *ϕ*. The usual Boolean connectives ⊤, ⊥, ∨, →, ↔ can be defined, hence we will use them as abbreviations. Additionally, we use the *globally* operator **G** *ϕ* := ¬**F** ¬*ϕ* to speak about events happening globally in the future.
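As a concrete reading of these definitions, here is a minimal Python sketch (ours, not part of the paper; the nested-tuple encoding of formulae is an illustrative assumption) that evaluates LTL**F** formulae over finite words given as lists of sets of letters; the derived operator **G** is included as an abbreviation.

```python
# A minimal sketch of the LTL_F semantics above: eval_at implements the
# satisfaction relation w, i |= phi for formulae given as nested tuples.

def eval_at(w, i, phi):
    kind = phi[0]
    if kind == "ap":                       # w, i |= a  iff  a in w_i
        return phi[1] in w[i]
    if kind == "not":
        return not eval_at(w, i, phi[1])
    if kind == "and":
        return eval_at(w, i, phi[1]) and eval_at(w, i, phi[2])
    if kind == "F":                        # exists j with |w| > j >= i
        return any(eval_at(w, j, phi[1]) for j in range(i, len(w)))
    raise ValueError(kind)

def G(phi):                                # G phi := not F not phi
    return ("not", ("F", ("not", phi)))

# Usage: "every position satisfying r also (eventually) sees g".
word = [{"r"}, {"g"}, set()]
spec = G(("not", ("and", ("ap", "r"), ("not", ("F", ("ap", "g"))))))
print(eval_at(word, 0, spec))              # True
```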

*Percentage extension.* In our investigation, *percentage operators* **PM**, **MFL** and **Half** are added to LTL**F**.

The operator **PM***ϕ* (read as: *majority in the past*) is satisfied if at least half of the positions in the past satisfy *ϕ*:

$$\mathfrak{w}, i \models \mathbf{PM}\varphi \text{ if } |\{j < i \colon \mathfrak{w}, j \models \varphi\}| \ge \frac{i}{2}$$

For example, the formula **<sup>G</sup>** (*<sup>r</sup>* ↔ ¬*g*) <sup>∧</sup> **<sup>G</sup> PM***<sup>r</sup>* <sup>∧</sup> **<sup>G</sup> <sup>F</sup>** (*<sup>g</sup>* <sup>∧</sup> **PM***g*) is true over words where each *request r* is eventually fulfilled by a *grant g*, and where each grant corresponds to at least one request. This can be also seen as the language of balanced parentheses, showing that with the operator **PM** one can define properties that are not regular.

The operator **MFL** *<sup>σ</sup>* (read as: *most-frequent letter in the past*), for *<sup>σ</sup>* <sup>∈</sup> AP, is satisfied if *σ* is among the letters with the highest number of appearances in the past, *i*.*e*.

$$\mathfrak{w}, i \models \mathbf{MFL}\,\sigma \ \text{ if } \ \forall \tau \in \mathsf{AP}.\ |\{j < i \colon \mathfrak{w}, j \models \sigma\}| \ge |\{j < i \colon \mathfrak{w}, j \models \tau\}|$$

For example, the formula **<sup>G</sup>** <sup>¬</sup>(*<sup>r</sup>* <sup>∧</sup> *<sup>g</sup>*) <sup>∧</sup> **<sup>G</sup> MFL** *<sup>r</sup>* <sup>∧</sup> **<sup>G</sup> <sup>F</sup>** (*<sup>g</sup>* <sup>∧</sup> **MFL** *<sup>g</sup>*) again defines words where each request is eventually fulfilled, but this time the formula allows for states where nothing happens (*i*.*e*. when both *r* and *g* are false).

The last operator, **Half** is used to simplify the forthcoming undecidability proofs. This operator can be satisfied only at even positions, and its intended meaning is *exactly half of the past positions satisfy a given formula*.

$$\mathfrak{w}, i \models \mathbf{Half}\,\varphi \ \text{ if } \ |\{j < i \colon \mathfrak{w}, j \models \varphi\}| = \frac{i}{2}$$

It is not difficult to see that the operator **Half** *ϕ* can be defined in terms of the past-majority operator as **PM**(*ϕ*) <sup>∧</sup> **PM**(¬*ϕ*) and that **Half** *<sup>ϕ</sup>* can be satisfied only at even positions.
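The three percentage operators can be read directly as counting conditions. The following small sketch (ours; the finite `alphabet` parameter of `mfl` is an assumption needed to make the quantification over AP effective) mirrors the definitions above.

```python
# Counting-based evaluation of PM, Half and MFL at a position i, where a
# word is again a list of sets of letters and `holds` tests a formula.

def pm(w, i, holds):
    """w, i |= PM phi: at least half of the past positions satisfy phi."""
    return sum(1 for j in range(i) if holds(w, j)) >= i / 2

def half(w, i, holds):
    """w, i |= Half phi: exactly half of the past positions satisfy phi."""
    return sum(1 for j in range(i) if holds(w, j)) == i / 2

def mfl(w, i, sigma, alphabet):
    """w, i |= MFL sigma: sigma is among the most frequent past letters."""
    count = lambda a: sum(1 for j in range(i) if a in w[j])
    return all(count(sigma) >= count(tau) for tau in alphabet)

word = [{"a"}, set(), {"a"}]
print(pm(word, 2, lambda w, j: "a" in w[j]))      # True: 1 of 2 past positions
print(half(word, 2, lambda w, j: "a" in w[j]))    # True
print(mfl(word, 2, "a", {"a", "b"}))              # True
```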

In the next sections, we distinguish different logics by enumerating the allowed operators in the subscripts, *e*.*g*. LTL**F***,***PM** or LTL**F***,***MFL**.

*Computational problems.* *Kripke structures* are commonly used in verification to formalise abstract models. A Kripke structure is composed of a finite set $S$ of *states*, a set of *initial* states $I \subseteq S$, a total *transition* relation $R \subseteq S \times S$, and a finite *labelling function* $\ell : S \to 2^{\mathsf{AP}}$. A *trace* of a Kripke structure is a finite word $\ell(s_0), \ell(s_1), \ldots, \ell(s_k)$ for any $s_0, s_1, \ldots, s_k$ satisfying $s_0 \in I$ and $(s_i, s_{i+1}) \in R$ for all $i < k$.

The *model-checking problem* amounts to checking whether *some* trace of a given Kripke structure satisfies a given formula *ϕ*. In the *satisfiability problem*, or simply in *SAT*, we check whether an input formula *ϕ* has a *model*, *i*.*e*. a finite word w witnessing w |= *ϕ*.

### **3 Playing with Half Operator**

Before we jump into the encoding of Minsky machines, we present some exercises to help the reader understand the expressive power of the logic LTL**F***,***Half**. The tools established in the exercises play a vital role in the undecidability proofs provided in the following section.

We start from the definition of shadowy words.

**Definition 1.** *Let wht and shdw be fixed distinct atomic propositions from* AP*. A word* w *is* shadowy *if its length is even, all even positions of* w *are labelled with wht, all odd positions of* w *are labelled with shdw, and no position is labelled with both letters.*

We will call the positions satisfying *wht* simply *white* and their successors satisfying *shdw* simply their *shadows*.

The following exercise is simple in LTL, but becomes much more challenging without the **X** operator.

*Exercise 1.* There is an LTL**F***,***Half** formula *ψshadowy* defining shadowy words.

*Solution.* We start with the "base" formula $\varphi^{\mathit{ex1}}_{\mathit{init}} := \mathit{wht} \land \mathbf{G}(\mathit{wht} \leftrightarrow \neg\mathit{shdw}) \land \mathbf{G}(\mathit{wht} \to \mathbf{F}\,\mathit{shdw})$, which states that position 0 is labelled with *wht*, each position is labelled with exactly one letter among *wht*, *shdw*, and every white position eventually sees a shadow in the future. What remains to be done is to ensure that only odd positions are shadows and that only even positions are white.

In order to do that, we employ the formula $\varphi^{\mathit{ex1}}_{\mathit{odd}} := \mathbf{G}((\mathbf{Half}\,\mathit{wht}) \leftrightarrow \mathit{wht})$. Since **Half** is never satisfied at odd positions, the formula $\varphi^{\mathit{ex1}}_{\mathit{odd}}$ stipulates that odd positions are labelled with *shdw*. An inductive argument shows that all even positions are labelled with *wht*: for position 0, it follows from $\varphi^{\mathit{ex1}}_{\mathit{init}}$. For an even position $p > 0$, assuming (inductively) that all smaller even positions are labelled with *wht*, exactly half of the positions in the past of $p$ are white, so the formula $\varphi^{\mathit{ex1}}_{\mathit{odd}}$ ensures that $p$ is labelled with *wht*.

Putting it all together, the formula $\psi_{\mathit{shadowy}} := \varphi^{\mathit{ex1}}_{\mathit{init}} \land \varphi^{\mathit{ex1}}_{\mathit{odd}}$ is as required.
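To see the two conjuncts of $\psi_{\mathit{shadowy}}$ at work, here is a small check (our own sketch, not from the paper) that evaluates them directly, with **Half** unfolded into its counting semantics.

```python
# Checking Exercise 1: psi_shadowy forces the alternation wht, shdw, ...

def half_wht(w, i):
    return sum(1 for j in range(i) if "wht" in w[j]) == i / 2

def sat_psi_shadowy(w):
    init = ("wht" in w[0]) and all(("wht" in p) != ("shdw" in p) for p in w) \
           and all(any("shdw" in w[j] for j in range(i, len(w)))
                   for i, p in enumerate(w) if "wht" in p)
    odd = all(half_wht(w, i) == ("wht" in w[i]) for i in range(len(w)))
    return init and odd

print(sat_psi_shadowy([{"wht"}, {"shdw"}, {"wht"}, {"shdw"}]))   # True
print(sat_psi_shadowy([{"wht"}, {"wht"}, {"shdw"}, {"shdw"}]))   # False
```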

In the next exercise, we show that it is possible to transfer the presence of certain letters from white positions into their shadows. It justifies the usage of "shadows" in the paper.

We introduce the so-called *counting terms*. For a formula $\varphi$, a word $\mathfrak{w}$ and a position $p$, by $\#^{<}_{\varphi}(\mathfrak{w}, p)$ we denote the total number of positions among $0, \ldots, p{-}1$ satisfying $\varphi$, *i*.*e*. the size of $\{p' < p \mid \mathfrak{w}, p' \models \varphi\}$. We omit $\mathfrak{w}$ in counting terms if it is known from the context.

*Exercise 2.* Let $\sigma$ and $\tilde{\sigma}$ be distinct letters from $\mathsf{AP} \setminus \{\mathit{wht}, \mathit{shdw}\}$. There is an LTL**F***,***Half** formula $\varphi^{\mathit{trans}}_{\sigma\tilde{\sigma}}$ such that $\mathfrak{w} \models \varphi^{\mathit{trans}}_{\sigma\tilde{\sigma}}$ iff:


*Solution.* Note that the first two conditions can be expressed with the conjunction of $\psi_{\mathit{shadowy}}$, $\mathbf{G}(\sigma \to \mathit{wht})$ and $\mathbf{G}(\tilde{\sigma} \to \mathit{shdw})$. The last condition is more involved. Assuming that the words under consideration satisfy conditions 1–2, it is easy to see that the third condition is equivalent to expressing that all white positions $p$ satisfy the equation (♥):

$$(\heartsuit):\ \#\_{wht\land\sigma}^{<}(\mathfrak{w},p)=\#\_{shdw\land\bar{\sigma}}^{<}(\mathfrak{w},p).$$

supplemented with the condition (♦), ensuring that the last white position satisfies condition 3, *i*.*e*.

(♦): for the last white position $p$ we have: $\mathfrak{w}, p \models \sigma \Leftrightarrow \mathfrak{w}, p{+}1 \models \tilde{\sigma}$.

The proof of the following lemma can be found in the appendix.

**Lemma 1.** *Let* w *be a word satisfying the conditions 1–2. Then* w *satisfies the condition 3 iff* w *satisfies* (♦) *and for all white positions p the equation* (♥) *holds.*

Going back to Exercise 2, we show how to define (♥) and (♦) in LTL**F***,***Half**, taking advantage of shadowness of the intended models. Take an arbitrary white position *p* of w. The equation (♥) for *p* is clearly equivalent to:

$$(\heartsuit^{\prime}):\ \#\_{wht\land\sigma}^{<}(\mathfrak{w},p)+\left(\frac{p}{2}-\#\_{shdw\land\bar{\sigma}}^{<}(\mathfrak{w},p)\right)=\frac{p}{2}$$

Since $p$ is even, we infer that $\frac{p}{2} \in \mathbb{N}$. From the shadowness of $\mathfrak{w}$, we know that there are exactly $\frac{p}{2}$ shadows in the past of $p$. Moreover, each shadow satisfies either $\tilde{\sigma}$ or $\neg\tilde{\sigma}$. Hence, the expression $\frac{p}{2} - \#^{<}_{\mathit{shdw}\land\tilde{\sigma}}(\mathfrak{w},p)$ from (♥′) can be replaced with $\#^{<}_{\mathit{shdw}\land\neg\tilde{\sigma}}(\mathfrak{w},p)$. Finally, since *wht* and *shdw* label disjoint positions, the property that every white position $p$ satisfies (♥) can be written as an LTL**F***,***Half** formula $\varphi_{(\heartsuit)} := \mathbf{G}(\mathit{wht} \to \mathbf{Half}([\mathit{wht} \land \sigma] \lor [\mathit{shdw} \land \neg\tilde{\sigma}]))$. Its correctness follows from the correctness of each arithmetic transformation and the semantics of LTL**F***,***Half**.

For the property (♦), we first need to define formulae detecting the last and the second to last positions of the model. Detecting the last position is easy: since the last position of $\mathfrak{w}$ is a shadow, it is sufficient to express that it sees only shadows in its future, *i*.*e*. $\varphi^{\mathit{ex2}}_{\mathit{last}} := \mathbf{G}(\mathit{shdw})$. Similarly, a position is second to last if it is white and it sees only white or last positions in the future, which results in the formula $\varphi^{\mathit{ex2}}_{\mathit{stl}} := \mathit{wht} \land \mathbf{G}(\mathit{wht} \lor \varphi^{\mathit{ex2}}_{\mathit{last}})$. Note that the correctness of $\varphi^{\mathit{ex2}}_{\mathit{last}}$ and $\varphi^{\mathit{ex2}}_{\mathit{stl}}$ follows immediately from shadowness. Hence, we can define the formula $\varphi_{(\diamondsuit)}$ as $\mathbf{F}(\varphi^{\mathit{ex2}}_{\mathit{stl}} \land \sigma) \leftrightarrow \mathbf{F}(\varphi^{\mathit{ex2}}_{\mathit{last}} \land \tilde{\sigma})$. The conjunction of the formulae $\varphi_{(\heartsuit)}$ and $\varphi_{(\diamondsuit)}$ gives us $\varphi^{\mathit{trans}}_{\sigma\tilde{\sigma}}$.

We consider a generalisation of shadowy models, where each shadow mimics all letters from a finite set $\Sigma \subseteq \mathsf{AP}$ rather than just a single letter $\sigma$. Such a generalisation is described below. In what follows, we always assume that for each $\sigma \in \Sigma$ there is a unique $\tilde{\sigma}$, which is different from $\sigma$ and does not belong to $\Sigma$. Moreover, we always assume that $\sigma_1 \neq \sigma_2$ implies $\tilde{\sigma}_1 \neq \tilde{\sigma}_2$.

**Definition 2.** *Let* $\Sigma \subseteq \mathsf{AP} \setminus \{\mathit{wht}, \mathit{shdw}\}$ *be a finite set. A shadowy word* $\mathfrak{w}$ *is called* truly $\Sigma$-shadowy*, if for every letter* $\sigma \in \Sigma$ *only the white (resp. shadow) positions of* $\mathfrak{w}$ *can be labelled with* $\sigma$ *(resp.* $\tilde{\sigma}$*) and every white position* $p$ *of* $\mathfrak{w}$ *satisfies* $\mathfrak{w}, p \models \sigma \Leftrightarrow \mathfrak{w}, p{+}1 \models \tilde{\sigma}$*.*

Knowing the solution of the previous exercise, it is easy to come up with a formula $\psi^{\mathit{truly}-\Sigma}_{\mathit{shadowy}}$ defining truly $\Sigma$-shadowy models: just take the conjunction of $\psi_{\mathit{shadowy}}$ and $\varphi^{\mathit{trans}}_{\sigma\tilde{\sigma}}$ over all letters $\sigma \in \Sigma$. The correctness follows immediately from Exercise 2.

**Corollary 1.** *The formula* $\psi^{\mathit{truly}-\Sigma}_{\mathit{shadowy}}$ *defines the language of truly* $\Sigma$*-shadowy words.*

The next exercise shows how to compare cardinalities in LTL**F***,***Half** over truly *Σ*-shadowy models. We are not going to introduce any novel techniques here, but the exercise is of great importance: it is used in the next section to encode zero tests of Minsky machines.

*Exercise 3.* Let $\Sigma$ be a finite subset of $\mathsf{AP} \setminus \{\mathit{wht}, \mathit{shdw}\}$ and let $\alpha \neq \beta \in \Sigma$. There exists an LTL**F***,***Half** formula $\psi_{\#\alpha=\#\beta}$ such that for any truly $\Sigma$-shadowy word $\mathfrak{w}$ and any of its white positions $p$ the equivalence $\mathfrak{w}, p \models \psi_{\#\alpha=\#\beta} \Leftrightarrow \#^{<}_{\mathit{wht}\land\alpha}(\mathfrak{w},p) = \#^{<}_{\mathit{wht}\land\beta}(\mathfrak{w},p)$ holds.

The solution is in the appendix; here we briefly discuss the main idea, following the previous exercise. The main difficulty is to express the equality of counting terms, written as LHS = RHS. Note that it is clearly equivalent to LHS + ($\frac{p}{2}$ − RHS) = $\frac{p}{2}$. Unfold $\frac{p}{2}$ on the left-hand side, *i*.*e*. replace it with the total number of shadows in the past. Use the fact that $\mathfrak{w}$ satisfies $\varphi^{\mathit{trans}}_{\beta\tilde{\beta}}$, which implies the equality $\#^{<}_{\mathit{wht}\land\beta}(\mathfrak{w},p) = \#^{<}_{\mathit{shdw}\land\tilde{\beta}}(\mathfrak{w},p)$. Finally, get rid of the subtraction and write an LTL**F***,***Half** formula by employing **Half**. The presented exercises show that the expressive power of LTL**F***,***Half** is high enough that, under the mild assumption of truly-shadowness, it allows us to perform cardinality comparisons. We are now only a step away from showing undecidability of the logic, which is tackled next.

### **4 Undecidability of LTL extensions**

This section is dedicated to the main technical contribution of the paper, namely that LTL**F***,***Half**, LTL**F***,***PM** and LTL**F***,***MFL** have undecidable satisfiability and model checking problems. We start from LTL**F***,***Half**. Then, the undecidability of LTL**F***,***PM** will follow immediately from the fact that **Half** is definable by **PM**. Finally, we will show how the undecidability proof can be adjusted to LTL**F***,***MFL**.

We start by recalling the basics on Minsky Machines.

*Minsky machines.* A *deterministic Minsky machine* is, roughly speaking, a finite transition system equipped with two unbounded natural counters, where each counter can be incremented, decremented (only if it is positive), and tested for zero. Formally, a Minsky machine $\mathcal{A}$ is composed of a finite set of *states* $Q$ with a distinguished *initial* state $q_0$, and a transition function $\delta : Q \times \{0, +\}^2 \to \{-1, 0, 1\}^2 \times (Q \setminus \{q_0\})$ satisfying three additional requirements: whenever $\delta(q, f, s) = (\bar{f}, \bar{s}, q')$ holds, $\bar{f} = -1$ implies $f = +$, $\bar{s} = -1$ implies $s = +$ (*i*.*e*. only positive counters can be decremented) and $q \neq q'$ (the machine cannot enter the same state two times in a row). Intuitively, the first coordinate of $\delta$ describes the current state of the machine, the second and the third coordinates tell us whether the current value of the respective counter is zero or positive, the next two coordinates denote the update on the counters and the last coordinate denotes the target state.

We define a *run* of a Minsky machine $\mathcal{A}$ as a sequence of consecutive transitions of $\mathcal{A}$. Formally, a run of $\mathcal{A}$ is a finite word $\mathfrak{w} \in (Q \times \{0,+\}^2 \times \{-1,0,1\}^2 \times Q \setminus \{q_0\})^{+}$ such that, when writing $\mathfrak{w}_i$ as $(q^i, f^i, s^i, \bar{f}^i, \bar{s}^i, q^i_N)$, all the following conditions are satisfied:


It is not hard to see that this definition is equivalent to the classical one [26]. We say that a Minsky machine *reaches* a state *q* ∈ *Q* if there is a run with a letter containing *q* on its last coordinate. It is well known that the problem of checking whether a given Minsky machine reaches a given state is undecidable [26].
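For concreteness, the following sketch (ours; the transition table, state names and the partiality of δ are illustrative simplifications) simulates a deterministic Minsky machine and records the run as a sequence of letters of the form (q, f, s, f̄, s̄, q_N).

```python
# Simulating a Minsky machine: apply delta to the current state and the
# zero/positive flags of the two counters, recording one letter per step.

def run(delta, q0, steps):
    q, c1, c2 = q0, 0, 0
    trace = []
    for _ in range(steps):
        f = "0" if c1 == 0 else "+"
        s = "0" if c2 == 0 else "+"
        if (q, f, s) not in delta:
            break                      # delta kept partial for brevity
        df, ds, q_next = delta[(q, f, s)]
        trace.append((q, f, s, df, ds, q_next))
        c1, c2, q = c1 + df, c2 + ds, q_next
    return trace

# Hypothetical machine: in q0 increment the first counter and go to q1;
# in q1 decrement it and go to q2 (transitions back to q0 are disallowed).
delta = {("q0", "0", "0"): (1, 0, "q1"),
         ("q1", "+", "0"): (-1, 0, "q2")}
print(run(delta, "q0", 5))
```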

#### **4.1 "Half of" meets the halting problem**

We start by presenting an overview of the claimed reduction. Until the end of Section 4, let us fix a Minsky machine $\mathcal{A} = (Q, q_0, \delta)$ and a state $\mathsf{q} \in Q$. Our ultimate goal is to define an LTL**F***,***Half** formula $\psi^{\mathsf{q}}_{\mathcal{A}}$ such that $\psi^{\mathsf{q}}_{\mathcal{A}}$ has a model iff $\mathcal{A}$ reaches $\mathsf{q}$. To do so, we define a formula $\psi_{\mathcal{A}}$ such that there is a one-to-one correspondence between the models of $\psi_{\mathcal{A}}$ and the runs of $\mathcal{A}$. Expressing the reachability of $\mathsf{q}$, and thus $\psi^{\mathsf{q}}_{\mathcal{A}}$, based on $\psi_{\mathcal{A}}$ is easy.

Intuitively, the formula *ψ*<sup>A</sup> describes a shadowy word w encoding on its white positions the consecutive letters of a run of A. In order to express it, we introduce a set *Σ*A, composed of the following distinguished atomic propositions:


We formalise the one-to-one correspondence as the function *run*, which takes an appropriately defined shadowy model and returns the corresponding run of $\mathcal{A}$. More precisely, the function $\mathit{run}(\mathfrak{w})$ returns a run whose $i$-th configuration is $(q, f, s, \bar{f}, \bar{s}, q_N)$ if and only if the $i$-th white position of $\mathfrak{w}$ is labelled with $\mathit{from}_q$, $\mathit{fVal}_f$, $\mathit{sVal}_s$, $\mathit{fOP}_{\bar{f}}$, $\mathit{sOP}_{\bar{s}}$ and $\mathit{to}_{q_N}$.

The formula $\psi_{\mathcal{A}}$ ensures that its models are truly $\Sigma_{\mathcal{A}}$-shadowy words representing a run satisfying properties P1–P4. To construct it, we start from $\psi^{\mathit{truly}-\Sigma_{\mathcal{A}}}_{\mathit{shadowy}}$ and extend it with four conjuncts. The first two of them represent properties P1–P2 of runs. They can be written in LTL**F** in an obvious way.

To ensure the satisfaction of property P3, we observe that in some sense the letters $\mathit{from}_q$ and $\mathit{to}_q$ are paired in a model, *i*.*e*. after reaching a state of $\mathcal{A}$ you always need to leave it (the initial state is an exception here, but we assumed that there are no transitions to the initial state). Thus, to identify for which $q$ we should set the $\mathit{from}_q$ letter at position $p$, it is sufficient to see for which state we do not have a corresponding pair, *i*.*e*. for which state $q$ the number of white $\mathit{from}_q$ to the left of $p$ is not equal to the number of white $\mathit{to}_q$ to the left of $p$. We achieve this in the spirit of Exercise 3.

Finally, the satisfaction of property P4 can be achieved by checking for each position $p$ whether the number of white $\mathit{fOP}_{+1}$ to the left of $p$ is the same as the number of white $\mathit{fOP}_{-1}$ to the left of $p$, and similarly for the second counter. This reduces to checking the equicardinality of certain sets, which can be done by employing shadows and Exercise 3.

*The reduction.* Now we are ready to present the claimed reduction.

We first restrict the class of models under consideration to truly $\Sigma_{\mathcal{A}}$-shadowy words (for the feasibility of the equicardinality encoding) with the formula $\psi^{\mathit{truly}-\Sigma_{\mathcal{A}}}_{\mathit{shadowy}}$. Then, we express that the models satisfy properties P1 and P2. The first property can be expressed with $\psi_{P1} := \mathit{from}_{q_0} \land \mathit{fVal}_0 \land \mathit{sVal}_0$.

The property P2 will be a conjunction of two formulae. The first one, namely $\psi^{1}_{P2}$, is an immediate implementation of P2. The second one, *i*.*e*. $\psi^{2}_{P2}$, is not necessary, but simplifies the proof; we require that no position is labelled by more than six letters from $\Sigma_{\mathcal{A}}$.

$$\begin{aligned} \psi^{1}_{P2} &:= \mathbf{G}\Big(\mathit{wht} \to \bigvee_{\delta(q,f,s)=(\bar f,\bar s,q_N)} \big(\mathit{from}_q \land \mathit{fVal}_f \land \mathit{sVal}_s \land \mathit{fOP}_{\bar f} \land \mathit{sOP}_{\bar s} \land \mathit{to}_{q_N}\big)\Big), \\ \psi^{2}_{P2} &:= \mathbf{G} \bigwedge_{\substack{p_1,\ldots,p_7 \in \Sigma_{\mathcal{A}} \\ \text{pairwise different}}} \neg(p_1 \land p_2 \land \cdots \land p_7). \end{aligned}$$

We put $\psi_{P2} := \psi^{1}_{P2} \land \psi^{2}_{P2}$ and $\psi_{\mathit{enc\text{-}basics}} := \psi^{\mathit{truly}-\Sigma_{\mathcal{A}}}_{\mathit{shadowy}} \land \psi_{P1} \land \psi_{P2}$.

We now formalise the correspondence between intended models and runs. Let *run* be the function which takes a word $\mathfrak{w}$ satisfying $\psi_{\mathit{enc\text{-}basics}}$ and returns the word $\mathfrak{w}^{\mathcal{A}}$ such that $|\mathfrak{w}^{\mathcal{A}}| = |\mathfrak{w}|/2$ and for each position $i$ we have:

$$(\curvearrowright): \quad \mathfrak{w}^{\mathcal{A}}_{i} = (q, f, s, \bar{f}, \bar{s}, q_N) \ \text{ iff } \ \mathfrak{w}_{2i} \supseteq \{\mathit{wht}, \mathit{from}_q, \mathit{fVal}_f, \mathit{sVal}_s, \mathit{fOP}_{\bar{f}}, \mathit{sOP}_{\bar{s}}, \mathit{to}_{q_N}\}.$$

The definition of $\psi_{\mathit{enc\text{-}basics}}$ ensures that the function *run* is well defined and unambiguous, and that its results satisfy properties P1 and P2.

**Fact 5** *The function run is uniquely defined and returns words satisfying P1 and P2.*

What remains to be done is to ensure properties P3 and P4. Both formulas rely on the tools established in Exercise 3 and are defined as follows:

$$\begin{aligned} \psi_{P3} &:= \mathbf{G}\Big(\mathit{wht} \to \bigwedge_{q \in Q \setminus \{q_0\}} \big(\mathit{from}_q \lor \psi_{\#\mathit{from}_q=\#\mathit{to}_q}\big)\Big), \\ \psi_{P4} &:= \mathbf{G}\big(\mathit{fVal}_0 \to \psi_{\#\mathit{fOP}_{+1}=\#\mathit{fOP}_{-1}}\big) \land \mathbf{G}\big(\mathit{sVal}_0 \to \psi_{\#\mathit{sOP}_{+1}=\#\mathit{sOP}_{-1}}\big) \\ &\qquad \land\ \mathbf{G}\big(\mathit{wht} \to (\mathit{fVal}_0 \leftrightarrow \neg\mathit{fVal}_{+})\big) \land \mathbf{G}\big(\mathit{wht} \to (\mathit{sVal}_0 \leftrightarrow \neg\mathit{sVal}_{+})\big). \end{aligned}$$

**Lemma 2.** *If* $\mathfrak{w}$ *satisfies* $\psi_{\mathit{enc\text{-}basics}} \land \psi_{P3}$*, then* $\mathit{run}(\mathfrak{w})$ *satisfies P1–P3.*

*Proof.* The satisfaction of properties P1 and P2 by $\mathit{run}(\mathfrak{w})$ follows from Fact 5. Ad absurdum, assume that $\mathit{run}(\mathfrak{w})$ does not satisfy P3. This implies the existence of a white position $p$ in $\mathfrak{w}$ such that $\mathfrak{w}, p \models \mathit{to}_q$ but $\mathfrak{w}, p{+}2 \not\models \mathit{from}_q$ for some $q$. By our definition of Minsky machines, we conclude that $\mathfrak{w}, p \models \mathit{from}_{q'}$ for some $q' \neq q$. Thus, $\mathfrak{w}, p \not\models \mathit{from}_q$.

From the satisfaction of $\psi_{P3}$ by $\mathfrak{w}$ we know that $\mathfrak{w}, p \models \psi_{\#\mathit{from}_q=\#\mathit{to}_q}$. Let $k$ be the total number of positions labelled with $\mathit{from}_q$ before $p$. Since $\mathfrak{w}, p \models \psi_{\#\mathit{from}_q=\#\mathit{to}_q}$ holds, by Exercise 3 we infer that the number of positions satisfying $\mathit{to}_q$ before $p$ is also equal to $k$. Since $\mathfrak{w}, p{+}2 \not\models \mathit{from}_q$, from the satisfaction of $\psi_{P3}$ by $\mathfrak{w}$ we once more conclude $\mathfrak{w}, p{+}2 \models \psi_{\#\mathit{from}_q=\#\mathit{to}_q}$. But such a situation clearly cannot happen, due to the fact that the number of $\mathit{to}_q$ in the past of $p{+}2$ is equal to $k+1$, while the number of $\mathit{from}_q$ in its past is $k$.

Finally, let us define $\psi_{\mathcal{A}}$ as $\psi_{\mathit{enc\text{-}basics}} \land \psi_{P3} \land \psi_{P4}$. The use of ↔ in $\psi_{P4}$ guarantees that $\mathit{fVal}_0$ labels exactly the white positions having the first counter empty (and similarly for the second counter). The counters are never decreased from 0, thus the white positions not satisfying $\mathit{fVal}_0$ are exactly those having the first counter positive.

The proof of the forthcoming lemma relies on the correctness of Exercise 3, is quite similar to the proof of Lemma 2, and is presented in the appendix.

**Lemma 3.** *If* <sup>w</sup> *satisfies <sup>ψ</sup>*A*, then run*(w) *is a run of* <sup>A</sup>*.*

Lastly, to show that the encoding is correct, we need to show that each run has a corresponding model. This is again easy: it can be shown by constructing an appropriate $\mathfrak{w}$; the white positions are defined according to (↷), and the shadows can be constructed accordingly.

**Fact 6** *If* $\mathfrak{w}^{\mathcal{A}}$ *is a run of* $\mathcal{A}$*, then there is a word* $\mathfrak{w} \models \psi_{\mathcal{A}}$ *s.t.* $\mathit{run}(\mathfrak{w}) = \mathfrak{w}^{\mathcal{A}}$*.*

Let $\psi^{\mathsf{q}}_{\mathcal{A}} := \psi_{\mathcal{A}} \land \mathbf{F}(\mathit{to}_{\mathsf{q}})$. Observe that the formula $\psi^{\mathsf{q}}_{\mathcal{A}}$ is satisfiable if and only if $\mathcal{A}$ reaches $\mathsf{q}$. The "if" part follows from Lemma 3 and the satisfaction of the conjunct $\mathbf{F}(\mathit{to}_{\mathsf{q}})$ of $\psi^{\mathsf{q}}_{\mathcal{A}}$. The "only if" part follows from Fact 6. Hence, from the undecidability of the reachability problem for Minsky machines, we infer our main theorem:

**Theorem 1.** *The satisfiability problem for* LTL**F***,***Half** *is undecidable.*

#### **4.2 Undecidability of model-checking**

For a given alphabet $\Sigma$, we can define a Kripke structure $\mathcal{K}_\Sigma$ whose set of traces is the language $(2^\Sigma)^{+}$: the set of states $S$ of $\mathcal{K}_\Sigma$ is composed of all subsets of $\Sigma$, all states are initial (*i*.*e*. $I = S$), the transition relation is the maximal relation ($R = S \times S$) and $\ell(X) = X$ for any subset $X \subseteq \Sigma$. It follows that a formula $\varphi$ over an alphabet $\Sigma$ is satisfiable if and only if there is a trace of $\mathcal{K}_\Sigma$ satisfying $\varphi$. From the undecidability of the satisfiability problem for LTL**F***,***Half** we get:

**Theorem 2.** *Model-checking of* LTL**F***,***Half** *formulae over Kripke structures is undecidable.*
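For illustration, the structure $\mathcal{K}_\Sigma$ can be built directly; the sketch below (ours, with made-up function and variable names) enumerates all subsets of $\Sigma$ as states, makes every state initial, uses the full transition relation and labels each state by itself.

```python
# Building K_Sigma for a finite alphabet Sigma.

from itertools import combinations

def kripke_of(sigma):
    states = [frozenset(c) for r in range(len(sigma) + 1)
              for c in combinations(sorted(sigma), r)]
    initial = set(states)                       # every state is initial
    trans = {(s, t) for s in states for t in states}   # maximal relation
    label = {s: set(s) for s in states}         # each state labelled by itself
    return states, initial, trans, label

states, initial, trans, label = kripke_of({"a", "b"})
print(len(states), len(trans))   # 4 states, 16 transitions
```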

The decidability can be regained if additional constraints on the shape of Kripke structures are imposed: model-checking of LTL**F***,***Half** formulae over *flat* structures is decidable [13].

As discussed earlier, the **Half** operator can be expressed in terms of the **PM** operator. Hence, we conclude:

**Corollary 2.** *Model-checking and satisfiability problems for* LTL**F***,***PM** *are undecidable.*

#### **4.3 Most-Frequent Letter and Undecidability**

We next turn our attention to the **MFL** operator, which turns out to be a little bit problematic. Typically, formulae depend only on the atomic propositions that they explicitly mention. Here, this is not the case. Consider the formula $\varphi = \mathbf{MFL}\,a$ and the words $\mathfrak{w}_1 = \{a\}\{\}\{a\}$ and $\mathfrak{w}_2 = \{a,b\}\{b\}\{a,b\}$. Clearly, $\mathfrak{w}_1, 2 \models \varphi$ whereas $\mathfrak{w}_2, 2 \not\models \varphi$. This can be fixed in many ways – for example, by parametrising **MFL** with a domain, so that it expresses that "*a* is the most frequent letter among $b_1, \ldots, b_n$". We show, however, that even this very basic version of **MFL** is undecidable. The proof is an adaptation of our previous proofs with a little twist inside.

First, we adjust the definition of shadowy words. A word $\mathfrak{w}$ is *strongly shadowy* if $\mathfrak{w}$ is shadowy and, for each even position of $\mathfrak{w}$, *wht* and *shdw* are the most frequent letters in the past (among all letters labelling $\mathfrak{w}$), while for odd positions *wht* is the most frequent. Note that the words constructed in the previous sections were strongly shadowy, because each letter $\sigma$ appeared only at whites or only at shadows.

*Exercise 4.* There exists an LTL**F***,***MFL** formula $\psi^{\mathit{MFL}}_{\mathit{shadowy}}$ defining strongly shadowy words.

*Proof.* It suffices to revisit Exercise 1 and to modify the formula $\varphi^{\mathit{ex1}}_{\mathit{odd}}$ stipulating that odd positions are exactly those labelled with *shdw* (since it is the only formula employing **Half**). We claim that $\varphi^{\mathit{ex1}}_{\mathit{odd}}$ can be expressed with

> $\varphi^{\mathit{MFL}}_{\mathit{odd}} := \mathbf{G}\,[\mathbf{MFL}(\mathit{wht}) \land (\mathit{wht} \leftrightarrow \mathbf{MFL}(\mathit{shdw}))]$

Indeed, take any word $\mathfrak{w} \models \varphi^{\mathit{ex1}}_{\mathit{init}} \land \varphi^{\mathit{MFL}}_{\mathit{odd}}$. Of course we have $\mathfrak{w}, 0 \models \mathit{wht}$ (due to $\varphi^{\mathit{ex1}}_{\mathit{init}}$). Moreover, $\mathfrak{w}, 1 \models \mathit{shdw}$ holds: otherwise we would get a contradiction with *shdw* not being the most frequent letter in the past of 1. Now take $p > 1$ and assume that the word $\mathfrak{w}_0, \ldots, \mathfrak{w}_{p-1}$ is strongly shadowy. Consider two cases. If $p$ is odd, then both *wht* and *shdw* are the most frequent letters in the past of $p{-}1$ and $p{-}1$ is labelled by *wht*. Then, *shdw* is not the most frequent letter in the past of $p$ and thus $p$ is labelled by *shdw*, while *wht* is the most frequent letter in the past of $p$. If $p$ is even, then $p{-}2$ is labelled by *wht*, the most frequent letters in the past of $p{-}2$ are *wht* and *shdw*, and $p{-}1$ is labelled by *shdw*. Thus both *wht* and *shdw* are the most frequent letters in the past of $p$ and therefore $p$ is labelled by *wht*. Thus, $\mathfrak{w}_0, \ldots, \mathfrak{w}_p$ is strongly shadowy. By induction, $\mathfrak{w}$ is strongly shadowy. It can be readily checked that every strongly shadowy word satisfies $\psi^{\mathit{MFL}}_{\mathit{shadowy}}$.

We argue that over the strongly shadowy models, the formulae **Half** *σ* and **MFL** *σ* are equivalent.

**Lemma 4.** *For all strongly shadowy words* w |= *ψ^{MFL}_{shadowy}*, *all even positions* 2*i* *and all letters* *σ* *we have the equivalence* w, 2*i* |= **Half** *σ* *iff* w, 2*i* |= **MFL** *σ.*

*Proof.* If w, 2*i* |= **MFL** *σ*, then w, 2*i* |= **MFL** *wht* due to the strong shadowness of w. Hence #^{<}_{σ}(w, 2*i*) = #^{<}_{wht}(w, 2*i*) = 2*i*/2 = *i*, implying w, 2*i* |= **Half** *σ*.

Now, assume that w, 2*i* |= **Half** *σ* holds, so *σ* appears *i* times in the past. Since w is strongly shadowy, we know that *wht* is a most frequent letter in the past. Moreover, *wht* appears 2*i*/2 = *i* times in the past. Hence, w, 2*i* |= **MFL** *σ*.
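The following small Python check (our own sketch; names and encodings are ours) illustrates Lemma 4 on a concrete strongly shadowy word, reading **Half** *σ* at an even position 2*i* as "*σ* occurs exactly *i* times strictly before 2*i*".

```python
# Sanity check: on strongly shadowy words, Half(sigma) and MFL(sigma) agree
# at even positions.
def count_before(word, i, letter):
    return sum(1 for j in range(i) if letter in word[j])

def half(word, i, letter):          # only meaningful at even i here
    return 2 * count_before(word, i, letter) == i

def mfl(word, i, letter, alphabet):
    c = count_before(word, i, letter)
    return all(c >= count_before(word, i, b) for b in alphabet)

# A strongly shadowy word: whites at even positions, shadows at odd ones,
# and the extra letter "s" placed only at white positions.
word = [{"wht", "s"}, {"shdw"}, {"wht", "s"}, {"shdw"}, {"wht"}, {"shdw"}]
alphabet = {"wht", "shdw", "s"}
for i in range(0, len(word), 2):
    assert half(word, i, "s") == mfl(word, i, "s", alphabet)
print("Half and MFL agree at all even positions of this word")
```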

We say that a letter *σ* is *importunate* in a word w if *σ* labels more than half of the positions in some even prefix of w. Notice that strongly shadowy words cannot have importunate letters.

With the above lemma, it is tempting to finish the proof as follows: replace each **Half**(*ϕ*) in the formulae from Section 4.1 with **MFL**(*p_ϕ*) for some fresh atomic proposition *p_ϕ* and require that **G**(*ϕ* ↔ *p_ϕ*) holds. A formula obtained from *ϕ* in this way will be called a *dehalfication* of *ϕ* and will be denoted with dehalf(*ϕ*). The next lemma shows that dehalf(·) preserves satisfaction of certain LTL**F***,***Half** formulae.

**Lemma 5.** *Let ϕ be an* LTL**F***,***Half** *formula without nested* **Half** *operators and without the* **F** *modality, let Λ be the set of all formulae λ such that* **Half** *λ appears in ϕ, and let* w *be a word such that* w |= *ψ^{MFL}_{shadowy}* ∧ ⋀_{λ∈Λ} **G**(*p_λ* ↔ *λ*)*. Then for all even positions* 2*p of* w *we have that* w, 2*p* |= dehalf(*ϕ*) *implies* w, 2*p* |= *ϕ. Moreover,* w |= **G**(*wht* → dehalf(*ϕ*)) *implies* w |= **G**(*wht* → *ϕ*)*.*

*Proof.* The proof goes via structural induction over LTL**F***,***Half** formulae without nested **Half** operators and without **F** operators. The only interesting case is *ϕ* = **Half** *λ*, which follows from Lemma 4.

Note, however, that the above lemma works only one way: it fails when the formula *ϕ* is satisfied in more than half of the positions of some prefix, as that would make *p_ϕ* importunate, leading to unsatisfiability of *ψ^{MFL}_{shadowy}*.

#### **6.3 Most-Frequent Letter: The Reduction**

The next step is to construct a formula defining truly *Σ*_A-shadowy words, which are the crucial part of *ψ^{MFL}_{enc-basics}*. To do so, we first need to rewrite the formula *ϕ^{trans}_{σσ̃}*, transferring the truth of a letter *σ* from whites into their shadows. The main ingredient of *ϕ^{trans}_{σσ̃}* is the formula *ϕ*(♥) := **G**(*wht* → **Half**([*wht* ∧ *σ*] ∨ [*shdw* ∧ ¬*σ̃*])), which we replace with dehalf(*ϕ*(♥)). We call the obtained formula (*ϕ^{trans}_{σσ̃}*)^{MFL} and show its correctness below.

First, by Lemma 5 we know that every model of (*ϕ^{trans}_{σσ̃}*)^{MFL} is also a model of *ϕ^{trans}_{σσ̃}*. Then, the models of *ϕ^{trans}_{σσ̃}* can be made strongly shadowy, so dehalfication of *ϕ^{trans}_{σσ̃}* is satisfiability-preserving.

**Lemma 6.** *Let p_ϕ be a fresh letter for ϕ* := [*wht* ∧ *σ*] ∨ [*shdw* ∧ ¬*σ̃*]*. Take* w*, a strongly shadowy word satisfying* w |= *ϕ^{trans}_{σσ̃}* *without any occurrences of p_ϕ. Then* w′*, the word obtained by labelling with p_ϕ all the positions of* w *satisfying ϕ, is strongly shadowy.*

Hence, we obtain the correctness of (*ϕ^{trans}_{σσ̃}*)^{MFL}. By applying the same strategy to the other conjuncts of *ψ_{enc-basics}* and Fact 5, we obtain *ψ^{MFL}_{enc-basics}* satisfying:

**Corollary 3.** *The function run (taking as input the words satisfying ψ^{MFL}_{enc-basics}) is uniquely defined and returns words satisfying P1 and P2. Moreover, the formulae ψ^{MFL}_{enc-basics} and ψ_{enc-basics} are equi-satisfiable.*

Towards completing the undecidability proof, we need to prepare the rewritings of the formulae *ψ_{P3}* and *ψ_{P4}*. For *ψ_{P3}* we proceed similarly to the previous case. We know that the models of *ψ^{MFL}_{enc-basics}* ∧ dehalf(*ψ_{P3}*) satisfy P3 (due to Lemma 5 they satisfy *ψ_{P3}* and hence, by Lemma 2, also P3). To establish the existence of such models, we show again that the satisfiability of *ψ_{P3}* is preserved by dehalfication.

**Lemma 7.** *Let p_q be a fresh letter for ϕ_q* := [*wht* ∧ *from_q*] ∨ [*shdw* ∧ ¬*to_q*]*, indexed over q* ∈ *Q* \ {*q*₀}*. Take* w*, a strongly shadowy word satisfying* w |= *ψ^{MFL}_{enc-basics}* ∧ *ψ_{P3}* *without any occurrences of p_q. Then* w′*, the word obtained by labelling with p_q all the positions of* w *satisfying ϕ_q, is strongly shadowy.*

From Lemma 2, Lemma 7 and Lemma 5 we immediately conclude:

**Corollary 4.** *If* w *satisfies ψ^{MFL}_{enc-basics}* ∧ dehalf(*ψ_{P3}*)*, then run*(w) *satisfies P1–P3. Moreover, the formulae ψ^{MFL}_{enc-basics}* ∧ dehalf(*ψ_{P3}*) *and ψ_{enc-basics}* ∧ *ψ_{P3}* *are equi-satisfiable.*

The last formula to rewrite is *ψ<sup>P</sup>* <sup>4</sup>. We focus only on its first part, speaking about the first counter, *i*.*e*.

$$\mathbf{G}\,\big(\mathit{fVal}_0 \rightarrow \mathbf{Half}\,([\mathit{wht} \wedge \mathit{fOP}_{+1}] \vee [\mathit{shdw} \wedge \neg \mathit{fOP}'_{-1}])\big) \wedge \mathbf{G}\,\big(\mathit{wht} \rightarrow (\mathit{fVal}_0 \leftrightarrow \neg \mathit{fVal}_+)\big)$$

Note that this time we cannot simply dehalfise this formula: the letter responsible for the inner part of **Half** would necessarily be importunate – consider an initial fragment of a run of A in which A increments its first counter without decrementing it. Fortunately, we cannot say the same when the machine decrements the counter, and hence it suffices to express the equivalent (due to the even length of shadowy models) statement *ψ′_{P4}* as follows:

$$\mathbf{G}\,\big(\mathit{fVal}_0 \rightarrow \mathbf{Half}\,\neg([\mathit{wht} \wedge \mathit{fOP}_{+1}] \vee [\mathit{shdw} \wedge \neg \mathit{fOP}'_{-1}])\big) \wedge \mathbf{G}\,\big(\mathit{wht} \rightarrow (\mathit{fVal}_0 \leftrightarrow \neg \mathit{fVal}_+)\big).$$

As we did before, we show that dehalfication of *ψ*- *<sup>P</sup>* <sup>4</sup> preserves satisfiability:

**Lemma 8.** *Let p_ϕ be a fresh letter for ϕ* := ¬([*wht* ∧ *fOP*_{+1}] ∨ [*shdw* ∧ ¬*fOP*′_{−1}])*. Take* w*, a strongly shadowy word satisfying* w |= *ψ^{MFL}_{enc-basics}* ∧ dehalf(*ψ_{P3}*) ∧ *ψ′_{P4}* *without any occurrences of p_ϕ. Then* w′*, the word obtained by labelling with p_ϕ all the positions of* w *satisfying ϕ, is strongly shadowy.*

Finally, let (*ψ^q_A*)^{MFL} := *ψ^{MFL}_{enc-basics}* ∧ dehalf(*ψ_{P3}*) ∧ dehalf(*ψ′_{P4}*) ∧ **F** *to_q*. From Lemma 3, Lemma 8 and Lemma 5 we immediately conclude:

**Corollary 5.** *If* w *satisfies* (*ψ^q_A*)^{MFL} *then it satisfies P1–P4. Moreover, the formulae* (*ψ^q_A*)^{MFL} *and ψ^q_A are equi-satisfiable.*

Thus, by Theorem 1 and the above corollary, we obtain the undecidability of LTL**F***,***MFL**. Undecidability of the model-checking problem is concluded by virtually the same argument as in Section 6.1. Hence:

**Theorem 3.** *The model-checking and the satisfiability problems for* LTL**F***,***MFL** *are undecidable.*

### **7 Decidable variants**

We have shown that extending LTL**F** with frequency operators leads to undecidability. Without the operators that can express **F** (*e*.*g*. **F**, **G** or **U**), the decision problems become NP-complete. Below we assume the standard semantics of the LTL operator **X**, *i*.*e*. w, *i* |= **X** *ϕ* iff *i*+1 < |w| and w, *i*+1 |= *ϕ*.

**Theorem 4.** *Model-checking and satisfiability problems for* LTL**X***,***MFL***,***PM** *are NP-complete.*

The complexity of LTL**X***,***MFL***,***PM** is so low because the truth of a formula depends only on some initial fragment of a trace. This severely restricts the expressive power. Thus, we consider a different approach, motivated by [7].

In the new setting, we allow arbitrary LTL formulae as well as percentage operators, as long as they are not mixed with **G**. We introduce a logic LTL%, which extends classical LTL [29] with percentage operators of the form **P**_{⋈ k%} *ϕ* for any ⋈ ∈ {≤, <, =, >, ≥}, *k* ∈ ℕ and *ϕ* ∈ LTL. By way of example, the formula **P**_{< 20%}(*a*) is true at a position *p* if less than 20% of the positions before *p* satisfy *a*. The past majority operator is a special case of the percentage operator: **PM** ≡ **P**_{≥ 50%}. Formally:

$$\mathfrak{w}, i \models \mathbf{P}_{\bowtie k\%}\,\varphi \ \text{ iff } \ |\{j < i \colon \mathfrak{w}, j \models \varphi\}| \bowtie \left\lceil \tfrac{k}{100} \cdot i \right\rceil$$

To avoid undecidability, the percentage operators cannot appear under negation or be nested. Therefore, the syntax of LTL% is defined by the grammar *ϕ*, *ϕ*′ ::= *ψ*_LTL | *ϕ* ∨ *ϕ*′ | *ϕ* ∧ *ϕ*′ | **F**(*ψ*_LTL ∧ **P**_{⋈ k%} *ψ*′_LTL), where *ψ*_LTL, *ψ*′_LTL are (full) LTL formulae.
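For concreteness, here is a small Python evaluator (ours; the exact rounding convention of the semantics is avoided below by comparing 100·count with k·i, which stays within integers) of the percentage operator on a finite trace.

```python
# w, i |= P_{~ k%} phi: compare the number of earlier positions satisfying phi
# against k% of i.
import operator

CMP = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">": operator.gt, ">=": operator.ge}

def percentage(word, i, phi, cmp, k):
    """phi is a predicate on (word, position)."""
    count = sum(1 for j in range(i) if phi(word, j))
    return CMP[cmp](100 * count, k * i)

# Example: P_{<20%}(a) at position 10 of a trace where `a` holds once before.
trace = [{"a"}] + [set()] * 9
print(percentage(trace, 10, lambda w, j: "a" in w[j], "<", 20))  # True: 10% < 20%
```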

The main tool used in the decidability proof is Parikh automata [21]. A Parikh automaton P = (A, E) over the alphabet *Σ* is composed of a finite-state automaton A accepting words from *Σ*^∗ and a semi-linear set E given as a system of linear inequalities with integer coefficients, where the variables are *x_a* for *a* ∈ *Σ*. We say that P accepts a word w if A accepts w and the mapping assigning to each variable *x_a* from E the total number of positions of w carrying the letter *a* is a solution to E. Checking non-emptiness of the language of P can be done in NP [17]. Our main decidability result is obtained by constructing an appropriate Parikh automaton recognising the models of an input LTL% formula.
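The following Python sketch (class and method names are ours) illustrates the acceptance condition of a Parikh automaton: run the finite automaton and then test the Parikh image against the linear constraints. It only checks membership of a given word; the NP non-emptiness test [17] would additionally require solving the inequalities.

```python
from collections import Counter

class ParikhAutomaton:
    def __init__(self, states, init, final, delta, constraints):
        self.states, self.init, self.final = states, init, final
        self.delta = delta              # dict: (state, letter) -> state
        self.constraints = constraints  # list of predicates on a Counter

    def accepts(self, word):
        q = self.init
        for a in word:
            if (q, a) not in self.delta:
                return False
            q = self.delta[(q, a)]
        counts = Counter(word)          # the Parikh image of the word
        return q in self.final and all(c(counts) for c in self.constraints)

# Automaton over {a, b} accepting (a|b)*, with the constraint
# 100*x_a >= 50*(x_a + x_b), i.e. at least half of the letters are a's.
pa = ParikhAutomaton(
    states={0}, init=0, final={0},
    delta={(0, "a"): 0, (0, "b"): 0},
    constraints=[lambda c: 100 * c["a"] >= 50 * (c["a"] + c["b"])],
)
print(pa.accepts("aab"))   # True
print(pa.accepts("abbb"))  # False
```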

**Theorem 5.** *Model-checking and satisfiability problems for* LTL% *are decidable.*

*Proof.* Let *ϕ* ∈ LTL%. By turning *ϕ* into DNF, we can focus on checking satisfiability of a single disjunct. Hence, w.l.o.g. we assume that *ϕ* = *ϕ*₀ ∧ ⋀_{i=1}^{n} *ϕ_i*, where *ϕ*₀ is in LTL and all *ϕ_i* have the form **F**(*ψ^{i,1}_{LTL}* ∧ **P**_{⋈ k_i%} *ψ^{i,2}_{LTL}*) for some LTL formulae *ψ^{i,1}_{LTL}* and *ψ^{i,2}_{LTL}*. Observe that a word w is a model of *ϕ* iff it satisfies *ϕ*₀ and for each conjunct *ϕ_i* we can pick a witness position *p_i* from w such that w, *p_i* |= *ψ^{i,1}_{LTL}* ∧ **P**_{⋈ k_i%} *ψ^{i,2}_{LTL}*. Moreover, the percentage constraints inside such formulae speak only about the prefix w_{< p_i}. Thus, knowing the position *p_i* and the number of positions before *p_i* satisfying *ψ^{i,2}_{LTL}*, the percentage constraint inside *ϕ_i* can be imposed globally rather than locally. This suggests the use of Parikh

automata: the LTL part of *ϕ* can be checked by an appropriate automaton A (due to the fact that for an LTL formula over finite words one can build a finite-state automaton recognising its models [19]), and the global constraints, speaking about the satisfaction of percentage operators, can be ensured with a set of linear inequalities E.

Our plan is as follows: we decorate the intended models w with additional information on witnesses, such that the witness position *p_i* for *ϕ_i* will be labelled by *w_i* (and there will be a unique such position in a model), all positions before *p_i* will be labelled by *b_i* and, among them, we distinguish with a letter *s_i* some special positions, *i*.*e*. those satisfying *ψ^{i,2}_{LTL}*. More formally, for each *ϕ_i* we produce an LTL formula *ϕ′_i* according to the following rules:


Let *ϕ*′ := *ϕ*₀ ∧ ⋀_{i=1}^{n} *ϕ′_i* ∧ ⋀_{i=1}^{n} **F**(*w_i* ∧ **P**_{⋈ k_i%} *s_i*). Note that w |= *ϕ*′ implies w |= *ϕ*. Moreover, any model w |= *ϕ* can be labelled with letters *b_i*, *s_i*, *w_i* such that the decorated word satisfies *ϕ*′. Let *ϕ*′′ := *ϕ*₀ ∧ ⋀_{i=1}^{n} *ϕ′_i* and let E be the system of *n* inequalities with E_i given by 100 · *x_{s_i}* ⋈_i *k_i* · *x_{b_i}*. Now observe that any model of *ϕ*′ satisfies E (where the value assigned to *x_a* is the total number of positions labelled with *a*), due to the satisfaction of the counting operators, and vice versa: every word w |= *ϕ*′′ satisfying E is a model of *ϕ*′. This gives us a sufficient characterisation of the models of *ϕ*. Let A be a finite automaton recognising the models of *ϕ*′′; then the Parikh automaton P = (A, E), as we already discussed, is non-empty if and only if *ϕ* has a model. Since checking non-emptiness of P is decidable, we conclude that LTL% is decidable.

A rough complexity analysis yields an NExpTime upper bound on the problem: the automaton P that we constructed is exponential in *ϕ* (translating *ϕ* to DNF does not increase the complexity, since we only guess one disjunct, which is of polynomial size in *ϕ*). Moreover, checking non-emptiness can be done non-deterministically in time polynomial in the size of the automaton. The NExpTime bound is not optimal: we conjecture that the problem is PSpace-complete. We believe that by employing techniques similar to [7], one can construct P and check its non-emptiness on the fly, which should result in a PSpace upper bound.

For the model-checking problem, we observe that determining whether some trace of a Kripke structure K = (*S, I, R, l*) satisfies *ϕ* is equivalent to checking the satisfiability of the formula *ϕ*_K ∧ *ϕ*, where *ϕ*_K is a formula describing all the traces of K. Such a formula can be constructed in a standard manner. For simplicity, we treat *S* as a set of auxiliary letters, and consider the conjunction of (1) ⋁_{s∈I} *s*, (2) **G**(**X**⊤ → ⋁_{(s,s′)∈R}(*s* ∧ **X** *s*′)) and (3) ⋀_{s∈S} **G**(*s* → ⋀_{p∈l(s)} *p*), expressing that the trace starts with an initial state, that consecutive positions describe consecutive states, and that the trace is labelled by the appropriate letters. Thus, the model-checking problem can be reduced in polynomial time to the satisfiability problem.

#### **8 Two-Variable First-Order Logic with Majority**

The *Two-Variable First-Order Logic on words* (FO²[<]) is a robust fragment of First-Order Logic FO interpreted on finite words. It involves quantification over the variables *x* and *y* (ranging over the words' positions) and admits a linear order predicate < (interpreted as the natural order on positions) and the equality predicate =. Henceforth we assume the usual semantics of FO²[<] (*cf*. [16]).

In this section, we investigate the logic FO²_M[<], namely the extension of FO²[<] with the so-called *Majority quantifier* M. This quantifier has been intensively studied due to its close connection with circuit complexity and algebra, see *e*.*g*. [22,5,6]. Intuitively, the formula M*x.ϕ* specifies that at least half of all the positions in a model, after substituting *x* with them, satisfy *ϕ*. Formally, w |= M*x.ϕ* holds if and only if |w|/2 ≤ |{*p* | w, *p* |= *ϕ*[*x/p*]}|. We stress that the formula M*x.ϕ* may contain free occurrences of the variable *y*.

Note that the Majority quantifier shares similarities with the **PM** operator, but in contrast to **PM**, the M quantifier counts *globally*. We take advantage of these similarities and, by reusing the technique developed in the previous sections, show that the satisfiability problem for FO²_M[<] is also undecidable. We stress that our result significantly sharpens an existing undecidability result for FO with Majority from [23] (since in our case the number of variables is limited) as well as for FO²[<, *succ*] with Presburger arithmetic from [25] (since our counting mechanism is limited and the successor relation *succ* is disallowed).

*Proof plan* There are three possible approaches to proving the undecidability of FO²_M[<]. The first one is to reproduce all the results for LTL**F***,***PM**, which is rather uninspiring. The second one is to define a translation from LTL**F***,***PM** to FO²_M[<] that produces an equisatisfiable formula, but because of models of odd length this involves a lot of case analysis. Here we present a third approach, which, we believe, gives the best insight: we show a translation from LTL**F***,***PM** to FO²_M[<] that works for LTL**F***,***PM** formulae all of whose models are shadowy. Since we only use such models in the undecidability proof for LTL**F***,***PM**, this shows the undecidability of FO²_M[<].

*Shadowy models* We first focus on defining shadowy words in FO²_M[<]. Before we start, let us introduce a few useful macros in order to simplify the forthcoming formulae; their names reflect their intuitive meaning.

**–** Half*x.ϕ* := M*x.ϕ* ∧ M*x.*¬*ϕ*,
**–** *first*(*x*) := ¬∃*y* *y* < *x*, *second*(*x*) := ∃*y* *y* < *x* ∧ ∀*y* (*y* < *x* → *first*(*y*)),
**–** *last*(*x*) := ¬∃*y* *y* > *x*, *secondlast*(*x*) := ∃*y* *y* > *x* ∧ ∀*y* (*y* > *x* → *last*(*y*)).
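A small Python sketch (ours) of the Majority quantifier and the Half macro on finite words may help fix intuitions.

```python
# M x.phi holds when at least half of all positions satisfy phi;
# Half x.phi := M x.phi and M x.(not phi), i.e. exactly half.
def majority(word, phi):
    return 2 * sum(1 for p in range(len(word)) if phi(word, p)) >= len(word)

def half(word, phi):
    return majority(word, phi) and majority(word, lambda w, p: not phi(w, p))

def first(word, p): return p == 0
def last(word, p):  return p == len(word) - 1

# On a shadowy word, exactly half of the positions are white:
word = [{"wht"}, {"shdw"}, {"wht"}, {"shdw"}]
print(half(word, lambda w, p: "wht" in w[p]))   # True
print(half(word, lambda w, p: "shdw" in w[p]))  # True
```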

**Lemma 9.** *There is an* FO²_M[<] *formula ψ^{FO}_{shadowy} defining shadowy words.*

*Proof.* Let *ϕ^{lem9}_{base}* be a formula defining the language of all (non-empty) words in which the letters *wht* and *shdw* label disjoint positions, the first position satisfies *wht*, and the total numbers of *shdw* and *wht* coincide. It can be written, *e*.*g*., as ∀*x*(*wht*(*x*) ↔ ¬*shdw*(*x*)) ∧ ∃*x*(*first*(*x*) ∧ *wht*(*x*)) ∧ Half*x.wht*(*x*) ∧ Half*x.shdw*(*x*). To define shadowy words, it suffices to additionally specify that no two neighbouring positions carry the same letter among {*wht*, *shdw*}. This can be done with the following formulae, which may look complicated at first glance:

$$\varphi^{forbid}_{wht\cdot wht}(x) \; := \; \mathit{wht}(x) \rightarrow \text{Half}\,y.\left([y < x \wedge \mathit{wht}(y)] \vee [x < y \wedge \mathit{shdw}(y)]\right),$$

$$\varphi^{forbid}_{shdw\cdot shdw}(x) \; := \; \mathit{shdw}(x) \rightarrow \text{Half}\,y.\left([(y < x \vee x = y) \wedge \mathit{shdw}(y)] \vee [x < y \wedge \mathit{wht}(y)]\right).$$

Finally, let $\psi^{FO}_{shadowy} := \varphi^{lem9}_{base} \wedge \forall x.\left(\varphi^{forbid}_{wht\cdot wht}(x) \wedge \varphi^{forbid}_{shdw\cdot shdw}(x)\right)$.

Showing that shadowness implies the satisfaction of *ψ^{FO}_{shadowy}* can be done by routine induction. For the opposite direction, take w |= *ψ^{FO}_{shadowy}*. Since w |= *ϕ^{lem9}_{base}*, the only possibility for w not to be shadowy is to have two consecutive positions *p*, *p*+1 carrying the same letter. W.l.o.g. assume they are both white. Let *w* be the number of white positions to the left of *p* and let *s* be the number of shadows to the right of *p*. By applying *ϕ^{forbid}_{wht·wht}* to *p* we infer that *w* + *s* = ½|w|. On the other hand, by applying *ϕ^{forbid}_{wht·wht}* to *p*+1 it follows that (*w*+1) + *s* = ½|w|, which contradicts the previous equation. Hence, w is shadowy.

*Translation* It is a classical result from [16] that FO²[<] can express LTL**F**. We define a translation tr_*v*(*ϕ*) from LTL**F***,***PM** to FO²_M[<], parametrised by a variable *v* (where *v* is either *x* or *y* and *v̄* denotes the other variable), inductively. We write *v* ≤ *v̄* rather than *v* < *v̄* ∨ *v* = *v̄* for brevity. For the LTL**F** cases, we follow [16]: tr_*v*(*a*) := *a*(*v*), for a fresh unary predicate *a* for each *a* ∈ AP, tr_*v*(¬*ϕ*) := ¬tr_*v*(*ϕ*), tr_*v*(*ϕ* ∧ *ϕ*′) := tr_*v*(*ϕ*) ∧ tr_*v*(*ϕ*′), and tr_*v*(**F** *ϕ*) := ∃*v̄* (*v* ≤ *v̄*) ∧ tr_*v̄*(*ϕ*). For **PM**, we propose tr_*v*(**PM** *ϕ*) := M*v̄*((*v̄* < *v* ∧ tr_*v̄*(*ϕ*)) ∨ (*v̄* ≥ *v* ∧ *wht*(*v̄*))). Finally, for a given LTL**F***,***PM** formula *ϕ*, let tr(*ϕ*) stand for *ψ^{FO}_{shadowy}* ∧ ∃*x.*(*first*(*x*) ∧ tr_*x*(*ϕ*)).

The following lemma shows the correctness of the presented translation.

**Lemma 10.** *An* LTL**F***,***PM** *formula ϕ has a shadowy model iff* tr(*ϕ*) *has a model.*

Since the formulae used in our undecidability proof for LTL**F***,***PM** have only shadowy models, by Lemma 10 we conclude that FO<sup>2</sup> <sup>M</sup>[*<*] is also undecidable.

**Theorem 6.** *The satisfiability problem for* FO²_M[<] *is undecidable.*

### **9 Conclusions**

We have provided a simple proof showing that adding different percentage operators to LTL**F** yields undecidability. We showed that our technique can be applied to an extension of first-order logic on words, and we hope that our work will prove useful in showing undecidability for other extensions of temporal logics. Decidability results for logics with percentage operators in restricted contexts were also provided.

### **Acknowledgements**

Bartosz Bednarczyk was supported by the Polish Ministry of Science and Higher Education program "Diamentowy Grant" no. DI2017 006447. Jakub Michaliszyn was supported by NCN grant no. 2017/27/B/ST6/00299.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/ 4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### Combining Semilattices and Semimodules⋆

Filippo Bonchi and Alessio Santamaria(-)

Dipartimento di Informatica, Università degli Studi di Pisa, Pisa, Italy filippo.bonchi@unipi.it, alessio.santamaria@di.unipi.it

Abstract. We describe the canonical weak distributive law δ : SP → PS of the powerset monad P over the S-left-semimodule monad S, for a class of semirings S. We show that the composition of P with S by means of such <sup>δ</sup> yields *almost* the monad of convex subsets previously introduced by Jacobs: the only difference consists in the absence in Jacobs's monad of the empty convex set. We provide a handy characterisation of the canonical weak lifting of <sup>P</sup> to EM(S) as well as an algebraic theory for the resulting composed monad. Finally, we restrict the composed monad to finitely generated convex subsets and we show that it is presented by an algebraic theory combining semimodules and semilattices with bottom, which are the algebras for the finite powerset monad P<sup>f</sup> .

Keywords: algebraic theories · monads · weak distributive laws.

#### 1 Introduction

Monads play a fundamental role in different areas of computer science since they embody notions of computation [32], like nondeterminism, side effects and exceptions. Consider for instance automata theory: deterministic automata can be conveniently regarded as a certain kind of coalgebras on Set [33], nondeterministic automata as the same kind of coalgebras but on EM(P_f) [35], and weighted automata as coalgebras on EM(S) [4]. Here, P_f is the finite powerset monad, modelling nondeterministic computations, while S is the monad of semimodules over a semiring S, modelling various sorts of quantitative aspects when varying the underlying semiring S. It is worth mentioning two facts: first, rather than taking coalgebras over EM(T), the category of algebras for the monad T, one can also consider coalgebras over Kl(T), the Kleisli category induced by T [20]; second, these two approaches based on monads have led not only to a deeper understanding of the subject, but also to effective proof techniques [6,7,14], algorithms [1,8,22,36,39] and logics [19,21,27].

Since compositionality is often the key to mastering complex structures, computer scientists have devoted quite some effort to *composing monads* [40] or, equivalently, *algebraic theories* [24]. Indeed, the standard approach of composing monads by means of *distributive laws* [3] turned out to be somewhat unsatisfactory. On the one hand, distributive laws do not exist in many relevant cases:

⋆ Supported by the Ministero dell'Università e della Ricerca of Italy under Grant No. 201784YSZ5, PRIN2017 – ASPRA (*Analysis of Program Analyses*).

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 102–123, 2021. https://doi.org/10.1007/978-3-030-71995-1\_6

see [28,41] for some no-go theorems; on the other hand, proving their existence is error-prone: see [28] for a list of results that mistakenly assumed the existence of a distributive law of the powerset monad over itself.

Nevertheless, various weakenings of the notion of distributive law, e.g., distributive laws of functors over monads [26], have proved to be ubiquitous in computer science: they underlie GSOS specifications [38], sound coinductive up-to techniques [7] and complete abstract domains [5]. In this paper we exploit *weak distributive laws* in the sense of [15], which have recently been shown successful in composing the monads for nondeterminism and probability [17].

The goal of this paper is to combine the monads P_f and S mentioned above. Our interest in S stems from the wide expressiveness provided by the possibility of varying S: for instance, by taking S to be the Boolean semiring one obtains the monad P_f; by fixing S to be the field of reals, coalgebras over EM(S) turn out to be linear dynamical systems [34].

We proceed as follows. Rather than composing P<sup>f</sup> , we found it convenient to compose the *full*, not necessarily finite, powerset monad P with S. In this way we can reuse several results in [12] that provide necessary and sufficient conditions on the semiring S for the existence of a canonical weak [15] distributive law δ : SP → PS. Our first contribution (Theorem 21) consists in showing that such δ has a convenient alternative characterisation, whenever the underlying semiring is a *positive semifield*, a condition that is met, e.g., by the semirings of Booleans and non-negative reals.

This characterisation allows us to give a handy definition of the *canonical weak lifting* of P over EM(S) (Theorem 24) and to observe that such a lifting is *almost* the same as the monad C : EM(S) → EM(S) defined by Jacobs in [25] (Remark 25): the only difference is the absence in C of the empty subset. This difference becomes crucial when considering the composed monads, named CM : Set → Set in [25] and P_c S : Set → Set in this paper: the latter maps a set X into the set of convex subsets of SX, while the former additionally requires the subsets to be non-empty. It turns out that while Kl(CM) is not CPPO-enriched, a necessary condition for the coalgebraic framework in [20], Kl(P_c S) indeed is (Theorem 30).

Composing monads by means of weak distributive laws is rewarding in many respects: here we exploit the fact that algebras for the composed monad P<sup>c</sup>S coincide with δ-algebras, namely algebras for both P and S satisfying a certain pentagonal law. One can extract from this law some distributivity axioms that, together with the axioms for semimodules (algebras for the monad S) and those for complete semilattices (algebras for the monad P), provide an algebraic theory presenting the monad P<sup>c</sup>S (Theorem 32).

We conclude by coming back to the finite powerset monad P_f. By replacing, in the above theory, complete semilattices with semilattices with bottom (algebras for the monad P_f), one obtains a theory presenting the monad P_{fc}S of *finitely generated* convex subsets (Theorem 35), which is formally defined as a restriction of the canonical P_c S. The theory, displayed in Table 1, consists of the

Table 1. The sets of axioms E_SL for semilattices (left), E_LSM for S-semimodules (right) and E_D for their distributivity (bottom).


theory presenting the monad P<sup>f</sup> and the theory presenting the monad S with four distributivity axioms.

To save space we had to omit most of the proofs of the results in this article: the interested reader can find them in [9].

Notation. We assume the reader to be familiar with monads and their maps. Given a monad (M, η^M, μ^M) on C, EM(M) and Kl(M) denote, respectively, the Eilenberg-Moore category and the Kleisli category of M. The latter is defined as the category whose objects are the same as those of C and where a morphism f : X → Y in Kl(M) is a morphism f : X → M(Y) in C. We write U^M : EM(M) → C and U_M : Kl(M) → C for the canonical forgetful functors, and F^M : C → EM(M), F_M : C → Kl(M) for their respective left adjoints. Recall, in particular, that F^M(X) = (X, μ^M_X) and, for f : X → Y, F^M(f) = M(f). Given a natural number n, we denote by n the set {1,...,n}.
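As a concrete illustration (ours, not from the paper) of Kl(M), the following Python snippet shows Kleisli morphisms and their composition for the powerset monad, mirroring the correspondence with relations used later.

```python
# A Kleisli morphism X -> Y for the powerset monad is a function X -> P(Y);
# composition uses the multiplication (union) of the monad.
def kleisli_compose(g, f):
    """(g . f) in Kl(P): x |-> union of g(y) for y in f(x)."""
    return lambda x: frozenset(z for y in f(x) for z in g(y))

def unit(x):
    """eta_X(x) = {x}, the identity of Kl(P)."""
    return frozenset({x})

f = lambda n: frozenset({n, n + 1})        # a Kleisli map int -> P(int)
g = lambda n: frozenset({n * 2})           # another Kleisli map
print(sorted(kleisli_compose(g, f)(3)))    # [6, 8]
print(sorted(kleisli_compose(f, unit)(3))) # [3, 4]  (unit acts as identity)
```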

### 2 (Weak) Distributive laws

Given two monads S and T on a category C, is there a way to compose them to form a new monad ST on C? This question was answered by Beck [3] with his theory of *distributive laws*: natural transformations δ : TS → ST satisfying four axioms, which provide a canonical way to endow the composite functor ST with a monad structure. We begin by recalling the classic definition. In the following, let (T, η^T, μ^T) and (S, η^S, μ^S) be two monads on a category C.

Definition 1. *A* distributive law *of the monad* S *over the monad* T *is a natural transformation* δ : T S → ST *such that the following diagrams commute.*

One important result of Beck's theory is the bijective correspondence between distributive laws, liftings to Eilenberg-Moore algebras and extensions to Kleisli categories, in the following sense.

Definition 2. *A* lifting *of the monad* S *to* EM(T) *is a monad* (S̃, η^{S̃}, μ^{S̃}) *where*

$$\begin{array}{ccc} \mathsf{EM}(T) & \xrightarrow{\ \tilde{S}\ } & \mathsf{EM}(T)\\ {\scriptstyle U^T}\downarrow & & \downarrow{\scriptstyle U^T}\\ \mathsf{C} & \xrightarrow{\ S\ } & \mathsf{C} \end{array} \quad \text{commutes}, \qquad U^T\eta^{\tilde S} = \eta^S U^T, \qquad U^T\mu^{\tilde S} = \mu^S U^T.$$

*An* extension *of the monad* T *to* Kl(S) *is a monad* (T̃, η^{T̃}, μ^{T̃}) *such that*

$$\begin{array}{ccc} \mathsf{C} & \xrightarrow{\ T\ } & \mathsf{C}\\ {\scriptstyle F_S}\downarrow & & \downarrow{\scriptstyle F_S}\\ \mathsf{Kl}(S) & \xrightarrow{\ \tilde{T}\ } & \mathsf{Kl}(S) \end{array} \quad \text{commutes}, \qquad \eta^{\tilde T}F_S = F_S\eta^{T}, \qquad \mu^{\tilde T}F_S = F_S\mu^{T}.$$

Böhm [11] and Street [37] have studied various weaker notions of distributive law; here we shall use the one that consists in dropping the axiom involving η<sup>T</sup> in Definition 1, following the approach of Garner [15].

Definition 3. *A* weak distributive law of S over T *is a natural transformation* <sup>δ</sup> : T S <sup>→</sup> ST *such that the diagrams in* (1) *regarding* <sup>μ</sup><sup>S</sup>*,* <sup>μ</sup><sup>T</sup> *and* <sup>η</sup><sup>S</sup> *commute.*

There are suitable weaker notions of liftings and extensions which also bijectively correspond to weak distributive laws as proved in [11,15].

Definition 4. *A* weak lifting *of* S *to* EM(T) *consists of a monad* (S̃, η^{S̃}, μ^{S̃}) *on* EM(T) *and two natural transformations*

$$U^T\tilde{S} \xrightarrow{\ \iota\ } S\,U^T \xrightarrow{\ \pi\ } U^T\tilde{S}$$

*such that* πι = id_{U^T S̃} *and such that the following diagrams commute:*

*A* weak extension *of* T *to* Kl(S) *is a functor* T̃ : Kl(S) → Kl(S) *together with a natural transformation* μ^{T̃} : T̃T̃ → T̃ *such that* F_S T = T̃ F_S *and* μ^{T̃} F_S = F_S μ^T*.*

Theorem 5 ([3,11,15]). *There is a bijective correspondence between (weak) distributive laws* T S <sup>→</sup> ST*, (weak) liftings of* <sup>S</sup> *to* EM(T) *and (weak) extensions of* T *to* Kl(S)*.*

### 3 The Powerset and Semimodule Monads

The Monad *P*. Let us now consider, as S, the *powerset* monad (P, η^P, μ^P), where η^P_X(x) = {x} and μ^P_X(𝒰) = ⋃_{U∈𝒰} U. Its algebras are precisely the complete semilattices, and Kl(P) is isomorphic to the category Rel of sets and relations. Hence, giving a distributive law TP → PT is the same as giving an extension of T to Rel: for this to happen, the notion of weakly cartesian functor and natural transformation is crucial.

Definition 6. *A functor* <sup>T</sup> : <sup>S</sup>et <sup>→</sup> <sup>S</sup>et *is said to be* weakly cartesian *if and only if it preserves weak pullbacks. A natural transformation* ϕ: F → G *is said to be* weakly cartesian *if and only if its naturality squares are weak pullbacks.*

Kurz and Velebil [29] proved, using an original argument of Barr [2], that an endofunctor T on Set has at most one extension to Rel and this happens precisely when it is weakly cartesian; similarly a natural transformation ϕ: F → G, with <sup>F</sup> and <sup>G</sup> weakly cartesian, has at most one extension <sup>ϕ</sup>˜: <sup>F</sup>˜ <sup>→</sup> <sup>G</sup>˜, precisely when it is weakly cartesian. The following result is therefore immediate.

Proposition 7 ([15, Corollary 16]). *For any monad* (T,η<sup>T</sup> , μ<sup>T</sup> ) *on* Set*:*


The Monad *S*. Recall that a *semiring* is a tuple (S, +, ·, 0, 1) such that (S, +, 0) is a commutative monoid, (S, ·, 1) is a monoid, · distributes over + and 0 is an annihilating element for ·. In other words, a semiring is like a ring, except that elements need not have additive inverses. The natural numbers ℕ with the usual operations of addition and multiplication form a semiring. Similarly, the integers, rationals and reals form semirings. Also the Booleans Bool = {0, 1} with ∨ and ∧ acting as + and ·, respectively, form a semiring.

Every semiring S generates a *semimodule* monad S on Set as follows. Given a set X, S(X) = {ϕ: X → S | *supp* ϕ finite}, where *supp* ϕ = {x ∈ X | ϕ(x) ≠ 0}. For f : X → Y, define for all ϕ ∈ S(X)

$$\mathcal{S}(f)(\varphi) = \left( y \mapsto \sum\_{x \in f^{-1}\{y\}} \varphi(x) \right) \colon Y \to S.$$

This makes S a functor. The unit η<sup>S</sup> <sup>X</sup> : X → S(X) is given by η<sup>S</sup> <sup>X</sup>(x) = Δx, where Δ<sup>x</sup> is the Dirac function centred in x, while the multiplication μ<sup>S</sup> <sup>X</sup> : <sup>S</sup><sup>2</sup>(X) <sup>→</sup> <sup>S</sup>(X) is defined for all <sup>Ψ</sup> ∈ S<sup>2</sup>(X) as

$$\mu\_X^{\mathcal{S}}(\Psi) = \left( x \mapsto \sum\_{\varphi \in \operatorname{supp} \Psi} \Psi(\varphi) \cdot \varphi(x) \right) \colon X \to S.$$
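A minimal Python sketch (our encoding: finitely supported functions as dictionaries of non-zero values) of the functor action, unit and multiplication of S, here with ordinary addition and multiplication of non-negative numbers:

```python
def S_map(f, phi):
    """S(f)(phi)(y) = sum of phi(x) over x with f(x) = y."""
    out = {}
    for x, s in phi.items():
        out[f(x)] = out.get(f(x), 0) + s
    return {y: s for y, s in out.items() if s != 0}

def unit(x):
    """eta(x) is the Dirac function centred in x."""
    return {x: 1}

def mult(Psi):
    """mu(Psi)(x) = sum over phi of Psi(phi) * phi(x); Psi is a list of
    (phi, coefficient) pairs standing for a finitely supported function on S(X)."""
    out = {}
    for phi, c in Psi:
        for x, s in phi.items():
            out[x] = out.get(x, 0) + c * s
    return {x: s for x, s in out.items() if s != 0}

phi1, phi2 = {"x": 1, "y": 2}, {"y": 3}
print(mult([(phi1, 2), (phi2, 1)]))   # {'x': 2, 'y': 7}
print(S_map(lambda v: "z", phi1))     # {'z': 3}
print(mult([(unit("x"), 5)]))         # {'x': 5}
```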


Table 2. Definition of some properties of a semiring S. Here a, b, c, d ∈ S.

An algebra for S is precisely a *left-S-semimodule*, namely a set X equipped with a binary operation +, an element 0 and a unary operation λ· for each λ ∈ S, satisfying the equations in Table 1. Indeed, if X carries a semimodule structure then one can define a map a: SX → X as, for ϕ ∈ SX,

$$a(\varphi) = \sum\_{x \in X} \varphi(x) \cdot x \tag{4}$$

where the above sum is finite because so is *supp* ϕ. Vice versa, if (X, a) is an S-algebra, then the corresponding left-semimodule structure on X is obtained by defining for all λ ∈ S and x, y ∈ X

$$x +^a y = a(x \mapsto 1, y \mapsto 1), \qquad 0^a = a(\varepsilon), \qquad \lambda \cdot^a x = a(x \mapsto \lambda). \tag{5}$$

Above and in the remainder of the paper, we write the list (x₁ ↦ s₁,...,x_n ↦ s_n) for the unique function ϕ: X → S with support {x₁,...,x_n} mapping x_i to s_i, and we write the empty list ε for the function constantly equal to 0. For instance, for a = μ^S_X : SSX → SX, the left-semimodule structure is defined for all ϕ₁, ϕ₂ ∈ SX and x ∈ X as

$$(\varphi_1 +^{\mu^{\mathcal{S}}_X} \varphi_2)(x) = \varphi_1(x) + \varphi_2(x), \qquad 0^{\mu^{\mathcal{S}}_X}(x) = 0, \qquad (\lambda \cdot^{\mu^{\mathcal{S}}_X} \varphi_1)(x) = \lambda \cdot \varphi_1(x).$$

Proposition 7 tells us exactly when a (weak) distributive law of the form <sup>T</sup>P→P<sup>T</sup> exists for an arbitrary monad <sup>T</sup> on <sup>S</sup>et. Take then <sup>T</sup> <sup>=</sup> <sup>S</sup>: when are the functor S and the natural transformations η<sup>S</sup> and μ<sup>S</sup> weakly cartesian? The answer has been given in [12] (see also [18]), where a complete characterisation in purely algebraic properties for S is provided. In Table 2 we recall such properties.

#### Theorem 8 ([12]). *Let* S *be a semiring.*


*Remark 9.* In [12, Proposition 9.1] it is proved that if S enjoys (C) and (D), then S is refinable; if S is a positive semifield, then it enjoys (B) and (E). In the next Proposition we prove that if S is a positive semifield then it is also refinable, hence S and μ<sup>S</sup> are weakly cartesian.

Proposition 10. *If* S *is a positive semifield, then it is refinable.*

*Proof.* Let a, b, c and d in S be such that a + b = c + d. If a + b = 0, then take x = y = z = t = 0, otherwise take

$$x = \frac{ac}{c+d}, \quad y = \frac{ad}{c+d}, \quad z = \frac{bc}{c+d}, \quad t = \frac{bd}{c+d}.$$

Then x + y = a, z + t = b, x + z = c, y + t = d.
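A quick numeric sanity check (ours) of these refinement witnesses over the positive reals:

```python
# Whenever a + b = c + d, the four values below satisfy
# x + y = a, z + t = b, x + z = c, y + t = d.
from math import isclose

def refine(a, b, c, d):
    assert isclose(a + b, c + d)
    s = c + d
    return a * c / s, a * d / s, b * c / s, b * d / s

a, b, c, d = 3.0, 7.0, 4.0, 6.0
x, y, z, t = refine(a, b, c, d)
assert isclose(x + y, a) and isclose(z + t, b)
assert isclose(x + z, c) and isclose(y + t, d)
print(x, y, z, t)   # 1.2 1.8 2.8 4.2
```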

*Example 11.* It is known that, for S = ℕ, a distributive law δ : SP → PS exists. Indeed, one can check that all conditions of Theorem 8 are satisfied, and therefore we can apply Proposition 7.1. In this case, the monad S is naturally isomorphic to the commutative monoid monad, which given a set X returns the collection of all *multisets* of elements of X. The law δ is well known (see e.g. [15,23]): given a multiset A₁,...,A_n of subsets of X in SPX, where the A_i's need not be distinct, it returns the set of multisets {a₁,...,a_n | a_i ∈ A_i}.

Convex Subsets of Left-semimodules. Theorem 8 together with Proposition 7.1 tell us that whenever the element 1 of S can be decomposed as a non-trivial sum there is no distributive law δ : SP → PS. Semirings with this property abound, for example ℚ, ℝ, ℝ⁺ with the usual operations of sum and multiplication, as well as Bool (since 1 ∨ 1 = 1). Such semirings are precisely those for which the notion of *convex subset* of their left-semimodules is non-trivial. For the existence of a *weak* distributive law, however, this condition on 1_S is not required: convexity will indeed play a crucial role in the definition of the weak distributive law.

Definition 12. *Let* S *be a semiring,* X *an* S*-left-semimodule and* A ⊆ X*. The* convex closure *of* A *is the set*

$$\overline{A} = \left\{ \sum\_{i=1}^{n} \lambda\_i \cdot a\_i \mid n \in \mathbb{N}, \ a\_i \in A, \sum\_{i=1}^{n} \lambda\_i = 1 \right\} \subseteq X.$$

*The set* A *is said to be* convex *if and only if* $A = \overline{A}$*.*

Recalling that the category of S-left-semimodules is isomorphic to EM(S), we can use (4) to translate Definition 12 of a convex subset of a semimodule into the following notion of a convex subset of an S-algebra a: SX → X.

Definition 13. *Let* <sup>S</sup> *be a semiring,* (X, a) <sup>∈</sup> EM(S)*,* <sup>A</sup> <sup>⊆</sup> <sup>X</sup>*. The* convex closure *of* A *in* (X, a) *is the set*

$$\overline{A}^a = \left\{ a(\varphi) \mid \varphi \in \mathcal{S}X, \operatorname{supp} \varphi \subseteq A, \sum\_{x \in X} \varphi(x) = 1 \right\}.$$

A *is said to be* convex *in* (X, a) *if and only if* $A = \overline{A}^a$*. We denote by* P^a_c X *the set of convex subsets of* X *with respect to* a*.*

*Remark 14.* Observe that ∅ is convex, because $\overline{\emptyset}^a = \emptyset$, since there is no ϕ ∈ SX with empty support such that $\sum_{x\in X} \varphi(x) = 1$.

*Example 15.* Suppose S is such that η^S is weakly cartesian (equivalently, (A) holds: x + y = 1 ⟹ x = 0 or y = 0), for example S = ℕ, and let (X, a) ∈ EM(S). A ϕ ∈ SX such that $\sum_{x\in X}\varphi(x) = 1$ and *supp* ϕ ⊆ A is a function that assigns 1 to *exactly one* element of A and 0 to all the other elements of X. These functions are precisely the Δ_x for the elements x ∈ A. Since a: SX → X is a structure map for an S-algebra, it maps the function Δ_x into x. Therefore $\overline{A}^a$ = {a(Δ_x) | x ∈ A} = {x | x ∈ A} = A. Thus *all* A ∈ PX are convex.

*Example 16.* When S = Bool, we have that S is naturally isomorphic to P_f, the finite powerset monad, whose algebras are idempotent commutative monoids, or equivalently semilattices with a bottom element. So, for (X, a) ∈ EM(S), a ϕ ∈ SX such that $\sum_{x\in X}\varphi(x) = 1$ and *supp* ϕ ⊆ A is any finitely supported function from X to Bool that assigns 1 to at least one element of A. Intuitively, such a ϕ selects a non-empty finite subset of A, and then a(ϕ) takes the join of all the selected elements. Thus, $\overline{A}^a$ adds to A all the possible joins of non-empty finite subsets of A: A is convex if and only if it is closed under binary joins.
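A small Python sketch (ours) of this Boolean instance, computing the convex closure as closure under joins in the semilattice of subsets of {1, 2, 3}:

```python
# The convex closure of A adds the joins of all non-empty finite subsets of A;
# since join is associative, commutative and idempotent, it suffices to join
# every combination of at least two elements of A.
from itertools import combinations

def convex_closure(A, join):
    closure = set(A)
    elems = list(A)
    for r in range(2, len(elems) + 1):
        for combo in combinations(elems, r):
            x = combo[0]
            for y in combo[1:]:
                x = join(x, y)
            closure.add(x)
    return closure

join = lambda s, t: s | t                       # join = union of subsets
A = {frozenset({1}), frozenset({2})}
print(convex_closure(A, join))
# {frozenset({1}), frozenset({2}), frozenset({1, 2})}: A was not convex
```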

### 4 The Weak Distributive Law *δ* **:** *SP → PS*

Weak extensions of <sup>S</sup> to <sup>K</sup>l(P) = <sup>R</sup>el only consist of extensions of the functor S and of the multiplication μ<sup>S</sup> , for which necessary and sufficient conditions are listed in Theorem 8. Hence for semirings S satisfying those criteria a weak distributive law δ : SP → PS does exist, and it is unique because there is only one extension of the functor <sup>S</sup> to <sup>R</sup>el.

Theorem 17. *Let* S *be a positive, refinable semiring satisfying* (B) *and* (E) *in Table 2. Then there exists a unique weak distributive law* δ : SP → PS *defined for all sets* X *and* Φ ∈ SPX *as:*

$$\delta_X(\Phi) = \left\{ \varphi \in \mathcal{S}X \;\middle|\; \exists \psi \in \mathcal{S}(\in_X).\ \begin{cases} \forall A \in \mathcal{P}X.\ \Phi(A) = \sum_{x \in A} \psi(A,x) & (a) \\ \forall x \in X.\ \varphi(x) = \sum_{A \ni x} \psi(A,x) & (b) \end{cases} \right\} \tag{6}$$

*where* ∈_X *is the set* {(A, x) ∈ PX × X | x ∈ A}*.*

The above δ, which is obtained by following the standard recipe of Proposition 7, is illustrated by the following example.

*Example 18.* Take S = R<sup>+</sup> with the usual operations of sum and multiplication. Consider X = {x, y, z, a, b}, A<sup>1</sup> = {x, y}, A<sup>2</sup> = {y, z} and A<sup>3</sup> = {a, b}. Let Φ ∈ S(PX) be defined as

$$\Phi = (A\_1 \mapsto 5, \quad A\_2 \mapsto 9, \quad A\_3 \mapsto 13).$$

and Φ(A) = 0 for all other sets A ⊆ X, so *supp* Φ = {A₁, A₂, A₃}. In order to find an element ϕ ∈ δ_X(Φ), we can first take a ψ ∈ S(∈_X) satisfying condition (a) in (6) and then compute the ϕ ∈ SX using condition (b).

Among the ψ ∈ S(∈_X), consider for instance the following:

$$
\psi = \begin{pmatrix} (A_1, x) \mapsto 2 & (A_2, y) \mapsto 4 & (A_3, a) \mapsto 6 \\ (A_1, y) \mapsto 3 & (A_2, z) \mapsto 5 & (A_3, b) \mapsto 7 \end{pmatrix}
$$

Since Φ(A1) = ψ(A1, x) + ψ(A1, y), Φ(A2) = ψ(A2, y) + ψ(A2, z) and Φ(A3) = ψ(A3, a) + ψ(A3, b), we have that ψ satisfies condition (a) in (6). Condition (b) forces ϕ to be the following:

$$
\varphi = (x \mapsto 2, \quad y \mapsto 3 + 4, \quad z \mapsto 5, \quad a \mapsto 6, \quad b \mapsto 7).
$$
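As a sanity check (ours), the following snippet verifies that this ψ and ϕ indeed satisfy conditions (a) and (b) of (6).

```python
Phi = {"A1": 5, "A2": 9, "A3": 13}
members = {"A1": {"x", "y"}, "A2": {"y", "z"}, "A3": {"a", "b"}}
psi = {("A1", "x"): 2, ("A1", "y"): 3, ("A2", "y"): 4,
       ("A2", "z"): 5, ("A3", "a"): 6, ("A3", "b"): 7}
phi = {"x": 2, "y": 7, "z": 5, "a": 6, "b": 7}

# (a): for every A, Phi(A) is the sum of psi(A, x) over x in A
assert all(Phi[A] == sum(psi.get((A, x), 0) for x in members[A]) for A in Phi)
# (b): for every x, phi(x) is the sum of psi(A, x) over the sets A containing x
assert all(phi[x] == sum(psi.get((A, x), 0) for A in Phi if x in members[A])
           for x in phi)
print("psi witnesses that phi belongs to delta_X(Phi)")
```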

*Remark 19.* If S enjoys (A) in Table 2, then the transformation δ given in (6) is actually a distributive law, and for S = N we recover the well-known δ of Example 11. Example 18 can be repeated with S = N: then Φ is the multiset where the set A<sup>1</sup> occurs five times, A<sup>2</sup> nine times and A<sup>3</sup> thirteen times. The elements of δX(Φ) are all those multisets containing one element per copy of A1, A<sup>2</sup> and A<sup>3</sup> in *supp* Φ. The ϕ provided indeed contains five elements of A<sup>1</sup> (two copies of x and three of y), nine elements of A<sup>2</sup> (four copies of y and five of z), thirteen elements of A<sup>3</sup> (six copies of a and seven of b).

As Example 18 shows, each element ϕ of δ_X(Φ) is determined by a function ψ choosing, for each set A ∈ *supp* Φ, a finite number of elements x^A_1,...,x^A_m in A and s^A_1,...,s^A_m in S in such a way that $\sum_{j=1}^{m} s^A_j = \Phi(A)$. The function ϕ maps each x^A_j to s^A_j if the sets in *supp* Φ are *disjoint*; if however there are x^A_j and x^B_k such that x^A_j = x^B_k (like y in Example 18), then x^A_j is mapped to s^A_j + s^B_k.

Among those ψ's there are some special, *minimal* ones, as it were, that choose for each A in *supp* Φ exactly *one* element of A and assign to it Φ(A). The induced ϕ in δ_X(Φ) can be described as $x \mapsto \sum_{A \in u^{-1}\{x\}} \Phi(A)$ (equivalently S(u)(Φ)¹), where u: *supp* Φ → X is a function selecting an element of A for each A ∈ *supp* Φ (that is, u(A) ∈ A). We denote the set of such ϕ's by c(Φ).

$$\mathfrak{c}(\Phi) = \{ \mathcal{S}(u)(\Phi) \mid u\text{: } \operatorname{supp} \Phi \to X \text{ such that } \forall A \in \operatorname{supp} \Phi. \, u(A) \in A \}\tag{7}$$

*Example 20.* Take X, A<sup>1</sup> and A<sup>2</sup> as in Example 18, but a different, smaller, Φ ∈ S(PX) defined as Φ = (A<sup>1</sup> → 1, A<sup>2</sup> → 2). There are only four functions u: *supp* Φ → X such that u(A) ∈ A and thus only four functions ϕ in c(Φ):

$$\begin{array}{l} u\_1 = \left( A\_1 \mapsto x, \quad A\_2 \mapsto y \right) \\ u\_2 = \left( A\_1 \mapsto x, \quad A\_2 \mapsto z \right) \\ u\_3 = \left( A\_1 \mapsto y, \quad A\_2 \mapsto y \right) \\ u\_4 = \left( A\_1 \mapsto y, \quad A\_2 \mapsto z \right) \end{array} \quad \begin{array}{l} \varphi\_1 = \left( x \mapsto 1, \ y \mapsto 2 \right) \\ \varphi\_2 = \left( x \mapsto 1, \ z \mapsto 2 \right) \\ \varphi\_3 = \left( y \mapsto 3 \right) \\ \varphi\_4 = \left( y \mapsto 1, \ z \mapsto 2 \right) \end{array}$$

Observe that the function ϕ = (x ↦ 1, y ↦ 1, z ↦ 1) belongs to δ_X(Φ) but not to c(Φ). Nevertheless, ϕ can be retrieved as the convex combination ½·ϕ₁ + ½·ϕ₂.
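The following Python sketch (ours) implements c(Φ) from (7) by enumerating choice functions, and reproduces the four functions above.

```python
from itertools import product

def c(Phi, members):
    """Phi: dict set-name -> coefficient; members: set-name -> set of elements."""
    names = list(Phi)
    result = []
    for choice in product(*(members[A] for A in names)):
        u = dict(zip(names, choice))               # a choice function, u(A) in A
        phi = {}
        for A in names:                            # S(u)(Phi): coefficients of sets
            phi[u[A]] = phi.get(u[A], 0) + Phi[A]  # sent to the same element add up
        result.append(phi)
    return result

Phi = {"A1": 1, "A2": 2}
members = {"A1": {"x", "y"}, "A2": {"y", "z"}}
for phi in c(Phi, members):
    print(phi)
# {'x': 1, 'y': 2}, {'x': 1, 'z': 2}, {'y': 3}, {'y': 1, 'z': 2}  (in some order)
```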

¹ More precisely, we should write S(u)(Φ′) where Φ′ is the restriction of Φ to *supp* Φ.

Our key result states that every ϕ ∈ δX(Φ) can be written as a convex combination (performed in the S-algebra (SX, μ<sup>S</sup> <sup>X</sup>)) of functions in c(Φ), at least when S is a positive semifield, which by Remark 9 and Proposition 10 satisfies all the conditions that make (6) a weak distributive law. The proof is laborious and omitted here: we only remark that divisions in S play a crucial role in it.

Theorem 21. *Let* S *be a positive semifield. Then for all sets* X *and* Φ ∈ SPX

$$\delta_X(\Phi) = \left\{ \mu^{\mathcal{S}}_X(\Psi) \;\middle|\; \Psi \in \mathcal{S}^2 X,\ \sum_{\varphi \in \mathcal{S}X} \Psi(\varphi) = 1,\ \operatorname{supp} \Psi \subseteq \mathfrak{c}(\Phi) \right\} = \overline{\mathfrak{c}(\Phi)}^{\mu^{\mathcal{S}}_X}. \tag{8}$$

*Remark 22.* If we drop the hypothesis of semifield and only have the minimal assumptions of Theorem 17, then (8) does not hold any more: S = N is a counterexample. Indeed, in this case every subset of SX is convex with respect to μ<sup>S</sup> <sup>X</sup> (see Example 15), therefore we would have δX(Φ) = c(Φ), which is false: the function ϕ of Example 18 is an example of an element in δX(Φ) \ c(Φ).

*Remark 23.* When <sup>S</sup> <sup>=</sup> <sup>B</sup>ool (which is a positive semifield), the monad <sup>S</sup> coincides with the monad P<sup>f</sup> . The function c(·) in (7) can then be described as

$$\mathfrak{c}(\mathcal{A}) = \{ \mathcal{P}\_f(u)(\mathcal{A}) \mid u \colon \mathcal{A} \to X \text{ such that } \forall A \in \mathcal{A}. \, u(A) \in A \}$$

for all 𝒜 ∈ P_f PX. It is worth remarking that this is the transformation χ appearing in Example 9 of [27] (which is in turn equivalent to the one in Example 2.4.7 of [31]). This transformation was erroneously supposed to be a distributive law, as it fails to be natural (see [28]). However, by taking its convex closure, as displayed in (8), one can turn it into a *weak* distributive law.

### 5 The Weak Lifting of *P* to EM(*S*)

By exploiting the characterisation of the weak distributive law δ (Theorem 21), we can now describe the weak lifting of <sup>P</sup> to EM(S) generated by <sup>δ</sup>.

Recall from Definition 13 that P^a_c X is the set of convex subsets of X with respect to the S-algebra a: SX → X. The functions ι_{(X,a)} : P^a_c X → PX and π_{(X,a)} : PX → P^a_c X are defined for all A ∈ P^a_c X and B ∈ PX as

$$
\iota_{(X,a)}(A) = A \qquad \text{and} \qquad \pi_{(X,a)}(B) = \overline{B}^a,\tag{9}
$$

that is ι(X,a) is just the obvious set inclusion and π(X,a) performs the convex closure in <sup>a</sup>. The function <sup>α</sup><sup>a</sup> : SP<sup>a</sup> <sup>c</sup> <sup>X</sup> → P<sup>a</sup> <sup>c</sup> <sup>X</sup> is defined for all <sup>Φ</sup> ∈ SP<sup>a</sup> <sup>c</sup> X as

$$\alpha\_a(\Phi) = \{a(\varphi) \: \mid \: \varphi \in \mathfrak{c}(\Phi)\}. \tag{10}$$

To be completely formal, above we should have written c(S(ι)(Φ)) in place of c(Φ), but it is immediate to see that the two sets coincide. Proving that α_a : SP^a_c X → P^a_c X is well defined (namely, that α_a(Φ) is a convex set) and forms an S-algebra requires some ingenuity and will be shown later in Section 5.1. The

assignment (X, a) ↦ (P^a_c X, α_a) gives rise to a functor P̃ : EM(S) → EM(S) defined on morphisms f : (X, a) → (X′, a′) as

$$
\tilde{\mathcal{P}}(f)(A) = \mathcal{P}f(A) \tag{11}
$$

for all A ∈ P^a_c X. For all (X, a) in EM(S), η^{P̃}_{(X,a)} : (X, a) → P̃(X, a) and μ^{P̃}_{(X,a)} : P̃P̃(X, a) → P̃(X, a) are defined for x ∈ X and 𝒜 ∈ P^{α_a}_c(P^a_c X) as

$$\eta^{\tilde{\mathcal{P}}}_{(X,a)}(x) = \{x\} \qquad \text{and} \qquad \mu^{\tilde{\mathcal{P}}}_{(X,a)}(\mathcal{A}) = \bigcup_{A \in \mathcal{A}} A. \tag{12}$$

Theorem 24. *Let* S *be a positive semifield. Then the canonical weak lifting of the powerset monad* P *to* EM(S)*, determined by* (8)*, consists of the monad* (P̃, η^{P̃}, μ^{P̃}) *on* EM(S) *defined as in* (10)*,* (11)*,* (12) *and the natural transformations* ι: U^S P̃ → P U^S *and* π : P U^S → U^S P̃ *defined as in* (9)*.*

It is worth spelling out the left-semimodule structure on $\mathcal{P}^a_c X$ corresponding to the $\mathcal{S}$-algebra $\alpha_a\colon \mathcal{S}\mathcal{P}^a_c X \to \mathcal{P}^a_c X$. Let us start with $\lambda \cdot^{\alpha_a} A$ for some $A \in \mathcal{P}^a_c X$. By (5), $\lambda \cdot^{\alpha_a} A = \alpha_a(\Phi)$ where $\Phi = (A \mapsto \lambda)$. By (10), $\alpha_a(\Phi) = \{a(\varphi) \mid \varphi \in \mathfrak{c}(\Phi)\}$. Following the definition of $\mathfrak{c}(\Phi)$ given in (7), one has to consider functions $u\colon \operatorname{supp}\Phi \to X$ such that $u(B) \in B$ for all $B \in \operatorname{supp}\Phi$: if $\lambda \neq 0$, then $\operatorname{supp}\Phi = \{A\}$ and thus, for each $x \in A$, there is exactly one function $u_x\colon \operatorname{supp}\Phi \to X$ mapping $A$ to $x$. It is immediate to see that $\mathcal{S}(u_x)(\Phi)$ is exactly the function $(x \mapsto \lambda)$ and thus $a(\mathcal{S}(u_x)(\Phi))$ is, by (5), $\lambda \cdot^a x$. If instead $\lambda = 0$, then $\operatorname{supp}\Phi = \emptyset$, so there is *exactly one* function $u\colon \operatorname{supp}\Phi \to X$, and $\mathcal{S}(u)(\Phi)$ is the function mapping all $x \in X$ to $0$; thus, by (5), $a(\mathcal{S}(u)(\Phi)) = 0^a$. Summarising,

$$\lambda \cdot^{\alpha\_a} A = \begin{cases} \{\lambda \cdot^a x \: \mid \: x \in A\} & \text{if } \lambda \neq 0\\ \{0^a\} & \text{if } \lambda = 0 \end{cases} \tag{13}$$

Following a similar line of thought, one can check that

$$A + ^{\alpha\_a}B = \{x + ^a y \: \mid \ x \in A, \ y \in B\} \qquad \text{and} \qquad 0^{\alpha\_a} = \{0^a\}.\tag{14}$$

*Remark 25.* By comparing (14) and (13) with (4) and (5) in [25], it is immediate to see that our monad $\tilde{\mathcal{P}}$ coincides with a slight variation of Jacobs's convex powerset monad $\mathcal{C}$, the only difference being that we do allow $\emptyset$ to be in $\mathcal{P}^a_c X$. Jacobs insisted on the necessity of $\mathcal{C}(X)$ being the set of *non-empty* convex subsets of $X$, because otherwise he was not able to define a semimodule structure on $\mathcal{C}(X)$ such that $0 \cdot \emptyset = \{0^a\}$. However, we do manage to do so, since by (13), $0 \cdot A = \{0^a\}$ for all $A$, and in particular for $A = \emptyset$. At first sight, this may look like an ad-hoc solution, but this is not the case: it is intrinsic in the definition of the unique weak lifting of $\mathcal{P}$ to EM($\mathcal{S}$), as stated by Theorem 24 and shown next.
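For a concrete feel of (13) and (14), here is a tiny illustrative Python sketch (our encoding, with the helper names `scalar` and `add` chosen for illustration and not taken from the paper) for the case where the $\mathcal{S}$-algebra is $\mathbb{R}^+$ itself: convex subsets of $\mathbb{R}^+$ are intervals, encoded as pairs, with `None` standing for the empty set. Note how $0 \cdot \emptyset = \{0\}$, the point discussed in Remark 25.

```python
import math

def scalar(lam, A):
    """lambda . A as in (13): pointwise scaling for lambda != 0, {0} for lambda == 0."""
    if lam == 0:
        return (0.0, 0.0)            # {0^a}, even when A is the empty set
    if A is None:
        return None
    return (lam * A[0], lam * A[1])

def add(A, B):
    """A + B as in (14): the Minkowski sum {x + y | x in A, y in B}."""
    if A is None or B is None:
        return None
    return (A[0] + B[0], A[1] + B[1])

print(scalar(2.0, (1.0, 3.0)))           # (2.0, 6.0)
print(scalar(0.0, None))                 # (0.0, 0.0): 0 . empty = {0}
print(add((1.0, 2.0), (3.0, math.inf)))  # (4.0, inf)
```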

#### 5.1 Proof of Theorem 24

By Theorem 5, the weak distributive law (6) corresponds to a weak lifting $\tilde{\mathcal{P}}$ of $\mathcal{P}$ to EM($\mathcal{S}$), which we are going to show coincides with the data of (9)-(12). The image along $\tilde{\mathcal{P}}$ of an $\mathcal{S}$-algebra $(X, a)$ will be a set $Y$ together with a structure map $\alpha_a$ that makes it an $\mathcal{S}$-algebra in turn. Garner [15, Proposition 13] gives us the recipe to build $Y$ and $\alpha_a$ appropriately. $Y$ is obtained by splitting the following idempotent in Set:

$$e\_{(X,a)} = \mathcal{P}X \xrightarrow{\eta\_{\mathcal{P}X}^S} \mathcal{S}(\mathcal{P}X) \xrightarrow{\delta\_X} \mathcal{P}(\mathcal{S}X) \xrightarrow{\mathcal{P}\_a} \mathcal{P}X \tag{15}$$

as a composite $e_{(X,a)} = \iota_{(X,a)} \circ \pi_{(X,a)}$, where $\pi_{(X,a)}$ is the corestriction of $e_{(X,a)}$ to its image and $\iota_{(X,a)}$ is the set-inclusion of the image of $e_{(X,a)}$ into $\mathcal{P}X$. In other words, $Y$ is the set of fixed points of $e_{(X,a)}$. The map $\alpha_a$ is obtained as the composite

$$\alpha_a = \; \mathcal{S}Y \xrightarrow{\mathcal{S}(\iota_{(X,a)})} \mathcal{S}\mathcal{P}X \xrightarrow{\delta_X} \mathcal{P}\mathcal{S}X \xrightarrow{\mathcal{P}a} \mathcal{P}X \xrightarrow{\pi_{(X,a)}} Y.$$

Let us, then, fix an $\mathcal{S}$-algebra $(X, a)$. Given $A \in \mathcal{P}X$, we have $\eta^{\mathcal{S}}_{\mathcal{P}X}(A) = \Delta_A\colon \mathcal{P}X \to \mathcal{S}$, the Dirac function centred in $A$. The set $\delta_X(\eta^{\mathcal{S}}_{\mathcal{P}X}(A))$ has a simple description, shown in the next lemma.

Lemma 26. *For all* A ∈ PX

$$\delta\_X(\eta^{\mathcal{S}}\_{\mathcal{P}X}(A)) = \left\{ \varphi \in \mathcal{S}X \mid \operatorname{supp} \varphi \subseteq A, \sum\_{x \in X} \varphi(x) = 1 \right\}.$$

The image of $A$ along the idempotent $e$ is therefore

$$e(A) = \mathcal{P}a(\delta_X(\eta^{\mathcal{S}}_{\mathcal{P}X}(A))) = \left\{ a(\varphi) \mid \varphi \in \mathcal{S}X, \operatorname{supp}\varphi \subseteq A, \sum_{x \in X} \varphi(x) = 1 \right\} = \overline{A}^a.$$

Hence the idempotent $e$ computes the convex closure of elements of $\mathcal{P}X$, and its fixed points are precisely the convex subsets of $X$ with respect to the structure map $a$. Therefore the carrier set of $\tilde{\mathcal{P}}(X, a)$ is precisely $\mathcal{P}^a_c X$, and the natural transformations $\pi$ and $\iota$ are, respectively, the convex closure operator and the set-inclusion of $\mathcal{P}^a_c X$ into $\mathcal{P}X$, as in (9).

$\mathcal{P}^a_c X$ is then equipped with a structure map $\alpha_a\colon \mathcal{S}\mathcal{P}^a_c X \to \mathcal{P}^a_c X$ given by

$$\alpha_a = \; \mathcal{S}\mathcal{P}^a_c X \xrightarrow{\mathcal{S}(\iota_{(X,a)})} \mathcal{S}\mathcal{P}X \xrightarrow{\delta_X} \mathcal{P}\mathcal{S}X \xrightarrow{\mathcal{P}a} \mathcal{P}X \xrightarrow{\pi_{(X,a)}} \mathcal{P}^a_c X.$$

Let us compute $\alpha_a$: given $\Phi\colon \mathcal{P}^a_c X \to \mathcal{S}$ with finite support, $\mathcal{S}(\iota_{(X,a)})(\Phi)$ is just the extension of $\Phi$ to $\mathcal{P}X$ which assigns $0$ to each non-convex subset of $X$. If we write $\iota$ instead of $\iota_{(X,a)}$ for short, we have

$$\alpha\_a(\Phi) = \overline{\mathcal{P}a(\delta\_X(\mathcal{S}(\iota)(\Phi)))}^a. \tag{16}$$

Next, we can use the following technical result.

Proposition 27. *Let $(X, a)$ be an $\mathcal{S}$-algebra. If $A$ is a convex subset of $(\mathcal{S}X, \mu^{\mathcal{S}}_X)$, then $\mathcal{P}a(A)$ is convex in $(X, a)$.*

Since $\delta_X(\Phi')$ is the convex closure of $\mathfrak{c}(\Phi')$ in $(\mathcal{S}X, \mu^{\mathcal{S}}_X)$ for every $\Phi' \in \mathcal{S}\mathcal{P}X$, by Proposition 27 we can avoid performing the $a$-convex closure in (16). Therefore

$$\alpha_a(\Phi) = \mathcal{P}a(\delta_X(\mathcal{S}(\iota)(\Phi))) = \mathcal{P}a\!\left(\overline{\mathfrak{c}(\mathcal{S}(\iota)(\Phi))}^{\mu^{\mathcal{S}}_X}\right).$$

In the next proposition we show that the $\mu^{\mathcal{S}}_X$-convex closure is also superfluous, due to the fact that $\Phi \in \mathcal{S}\mathcal{P}^a_c X$ (and not simply $\mathcal{S}\mathcal{P}X$), thus obtaining (10).

Proposition 28. *Let $\mathcal{S}$ be a positive semifield, $(X, a)$ an $\mathcal{S}$-algebra, and $\Phi \in \mathcal{S}\mathcal{P}^a_c X$. Then $\mathcal{P}a(\delta_X(\mathcal{S}(\iota)(\Phi))) = \mathcal{P}a(\mathfrak{c}(\mathcal{S}(\iota)(\Phi)))$.*

*Proof.* In this proof we shall simply write Φ instead of the more verbose S(ι)(Φ). We want to prove that

$$\mathcal{P}a\left(\delta_X(\Phi)\right) = \left\{ a(\psi) \;\middle|\; \psi \in \mathcal{S}X,\ \exists u\colon \operatorname{supp}\Phi \to X.\ \forall A \in \operatorname{supp}\Phi.\ u(A) \in A,\ \forall x \in X.\ \psi(x) = \sum_{\substack{A \in \operatorname{supp}\Phi \\ u(A) = x}} \Phi(A) \right\} \tag{17}$$

where we have, by Theorem 21, that

$$\mathcal{P}a\left(\delta\_X(\Phi)\right) = \{a(\mu\_X^{\mathcal{S}}(\Psi)) \mid \Psi \in \mathcal{S}^2X, \sum\_{\varphi \in \mathcal{S}X} \Psi(\varphi) = 1, supp \,\Psi \subseteq \mathfrak{c}(\Phi)\}.$$

First of all, $\emptyset$ is *not* an $\mathcal{S}$-algebra, because there is no map $\mathcal{S}(\emptyset) \to \emptyset$ given that $\mathcal{S}(\emptyset) = \{\emptyset\colon \emptyset \to \mathcal{S}\}$; hence $X \neq \emptyset$. Next, if $\Phi = \varepsilon\colon \mathcal{P}X \to \mathcal{S}$, namely the function constantly equal to $0$, then $\mathfrak{c}(\Phi) = \{\varepsilon\colon X \to \mathcal{S}\}$, therefore one can easily see that the left-hand side of (17) is equal to $\{a(\varepsilon\colon X \to \mathcal{S})\}$. For the same reason, the right-hand side is also equal to $\{a(\varepsilon\colon X \to \mathcal{S})\}$. Moreover, if $\Phi(\emptyset) \neq 0$, then there is no $u\colon \operatorname{supp}\Phi \to X$ such that $u(\emptyset) \in \emptyset$, so $\mathfrak{c}(\Phi) = \emptyset$ and so is the left-hand side of (17); for the same reason, the right-hand side is empty as well.

Suppose then, for the rest of the proof, that $\Phi \neq \varepsilon$ and that $\Phi(\emptyset) = 0$.

For the right-to-left inclusion in (17): given $\psi \in \mathfrak{c}(\Phi)$, consider $\Psi = \eta^{\mathcal{S}}_{\mathcal{S}X}(\psi) = \Delta_\psi \in \mathcal{S}^2X$. Then $\Psi$ clearly satisfies all the required properties and $\mu^{\mathcal{S}}_X(\Psi) = \psi$.

The left-to-right inclusion is more laborious. Let $\Psi \in \mathcal{S}^2X$ be such that $\sum_{\chi \in \mathcal{S}X} \Psi(\chi) = 1$ and such that $\operatorname{supp}\Psi \subseteq \mathfrak{c}(\Phi)$, that is, for all $\varphi \in \operatorname{supp}\Psi$ there is $u^\varphi\colon \operatorname{supp}\Phi \to X$ such that $u^\varphi(A) \in A$ for all $A \in \operatorname{supp}\Phi$ and $\varphi = \mathcal{S}(u^\varphi)(\Phi)$. We have to show that $a(\mu^{\mathcal{S}}_X(\Psi)) = a(\psi)$ for some $\psi \in \mathcal{S}X$ of the form $\sum_{A \in \operatorname{supp}\Phi} \Phi(A) \cdot u(A)$ for some choice function $u\colon \operatorname{supp}\Phi \to X$. Notice that the given $\Psi$ is a convex linear combination of functions $\varphi$ in $\mathcal{S}X$ like the one we have to produce: the trick will be to exploit the fact that each $A \in \operatorname{supp}\Phi$ is convex. Here we shall only give a sketch of the proof. Suppose $\operatorname{supp}\Phi = \{A_1, \dots, A_n\}$ and $\operatorname{supp}\Psi = \{\varphi^1, \dots, \varphi^m\}$. Call $u^j$ the choice function that generates $\varphi^j$. Then $\Psi$ is of this form:

$$\Psi = \left( \underbrace{\begin{pmatrix} u^1(A\_1) \mapsto \Phi(A\_1) \\\\ \vdots \\ u^1(A\_n) \mapsto \Phi(A\_n) \end{pmatrix}}\_{\varphi^1} \mapsto \Psi(\varphi^1), \dots, \underbrace{\begin{pmatrix} u^m(A\_1) \mapsto \Phi(A\_1) \\\\ \vdots \\ u^m(A\_n) \mapsto \Phi(A\_n) \end{pmatrix}}\_{\varphi^m} \mapsto \Psi(\varphi^m) \right).$$

Define the following element of $\mathcal{S}^2X$:

$$\Psi' = \left( \underbrace{\begin{pmatrix} u^1(A\_1) \mapsto \Psi(\varphi^1) \\\\ \vdots \\ u^m(A\_1) \mapsto \Psi(\varphi^m) \end{pmatrix}}\_{\chi^1} \mapsto \Phi(A\_1), \dots, \underbrace{\begin{pmatrix} u^1(A\_n) \mapsto \Psi(\varphi^1) \\\\ \vdots \\ u^m(A\_n) \mapsto \Psi(\varphi^m) \end{pmatrix}}\_{\chi^n} \mapsto \Phi(A\_n) \right).$$

Observe that $u^1(A_i), \dots, u^m(A_i) \in A_i$ by definition, and $A_i$ is convex by assumption: since $\sum_{j=1}^m \Psi(\varphi^j) = 1$, we have that $a(\chi^i) \in A_i$. Set then $u(A_i) = a(\chi^i)$ and define $\psi = \mathcal{S}(a)(\Psi')$: we have $\psi \in \mathfrak{c}(\Phi)$ with $u$ as the generating choice function. It is not difficult to see that $\mu^{\mathcal{S}}_X(\Psi) = \mu^{\mathcal{S}}_X(\Psi')$, therefore we have

$$a(\psi) = a\left(\mathcal{S}(a)(\Psi')\right) = a\left(\mu\_X^{\mathcal{S}}(\Psi')\right) = a\left(\mu\_X^{\mathcal{S}}(\Psi)\right).$$

as desired.

The rest of the proof of Theorem 24, concerning the action of <sup>P</sup>˜ on morphisms and the unit and multiplication of the monad <sup>P</sup>˜, consists in following the recipe provided by Garner [15].

#### 6 The Composite Monad: an Algebraic Presentation

We can now compose the two monads $\mathcal{P}$ and $\mathcal{S}$ by considering the monad arising from the composition of the two free-forgetful adjunctions, the one between Set and EM($\mathcal{S}$) and the one between EM($\mathcal{S}$) and EM($\tilde{\mathcal{P}}$).

Direct calculations show that the resulting endofunctor on Set, which we call P<sup>c</sup>S, maps a set X and a function f : X → Y into, respectively,

$$\mathcal{P}\_c \mathcal{S} X = \mathcal{P}\_c^{\mu\_X^S}(\mathcal{S} X) \qquad \text{and} \qquad \mathcal{P}\_c \mathcal{S}(f)(\mathcal{A}) = \{\mathcal{S}(f)(\Phi) \mid \Phi \in \mathcal{A}\} \tag{18}$$

for all $\mathcal{A} \in \mathcal{P}_c\mathcal{S}X$. For all sets $X$, $\eta^{\mathcal{P}_c\mathcal{S}}_X\colon X \to \mathcal{P}_c\mathcal{S}X$ and $\mu^{\mathcal{P}_c\mathcal{S}}_X\colon \mathcal{P}_c\mathcal{S}\mathcal{P}_c\mathcal{S}X \to \mathcal{P}_c\mathcal{S}X$ are defined as

$$\eta^{\mathcal{P}_c\mathcal{S}}_X(x) = \{\Delta_x\} \qquad \text{and} \qquad \mu^{\mathcal{P}_c\mathcal{S}}_X(\mathcal{A}) = \bigcup_{\Phi \in \mathcal{A}} \alpha_{\mu^{\mathcal{S}}_X}(\Phi) \tag{19}$$

for all $x \in X$ and $\mathcal{A} \in \mathcal{P}_c\mathcal{S}\mathcal{P}_c\mathcal{S}X$.

Theorem 29. *Let* S *be a positive semifield. Then the canonical weak distributive law* <sup>δ</sup> : SP → PS *given in Theorem 21 induces a monad* <sup>P</sup><sup>c</sup><sup>S</sup> *on* <sup>S</sup>et *with endofunctor, unit and multiplication defined as in* (18) *and* (19)*.*

Recall from Remark 25 that the monad $\mathcal{C}\colon$ EM($\mathcal{S}$) $\to$ EM($\mathcal{S}$) from [25] coincides with our lifting $\tilde{\mathcal{P}}$ modulo the absence of the empty set. The same happens for the composite monad, which is named $\mathcal{CM}$ in [25]. The absence of $\emptyset$ in $\mathcal{CM}$ turns out to be rather problematic for Jacobs. Indeed, in order to use the standard framework of coalgebraic trace semantics [20], one would need the Kleisli category Kl($\mathcal{CM}$) to be enriched over CPPO, the category of $\omega$-complete partial orders with *bottom* and continuous functions. Kl($\mathcal{CM}$) is not CPPO-enriched since there is no bottom element in $\mathcal{CM}(X)$. Instead, in $\mathcal{P}_c\mathcal{S}X$ the bottom is exactly the empty set; moreover, Kl($\mathcal{P}_c\mathcal{S}$) enjoys the properties required by [20].

Theorem 30. *The category Kl($\mathcal{P}_c\mathcal{S}$) is enriched over CPPO and satisfies the left-strictness condition: for all $f\colon X \to \mathcal{P}_c\mathcal{S}Y$ and all sets $Z$, $\bot_{Y,Z} \circ f = \bot_{X,Z}$.*

It is immediate that every homset in <sup>K</sup>l(P<sup>c</sup>S) carries a complete partial order. Showing that composition of arrows in <sup>K</sup>l(P<sup>c</sup>S) preserves joins (of <sup>ω</sup>-chains) requires more work: the proof, omitted here, crucially relies on the algebraic theory presenting the monad P<sup>c</sup>S, illustrated next.

An Algebraic Presentation. Recall that an *algebraic theory* is a pair T = (Σ,E) where Σ is a *signature*, whose elements are called *operations*, to each of which is assigned a cardinal number called its *arity*, while E is a class of formal *equations* between Σ-terms. An *algebra* for the theory T is a set A together with, for each operation <sup>o</sup> of arity <sup>κ</sup> in <sup>Σ</sup>, a function <sup>o</sup><sup>A</sup> : <sup>A</sup><sup>κ</sup> <sup>→</sup> <sup>A</sup> satisfying the equations of E. A *homomorphism* of algebras is a function f : A → B respecting the operations of Σ in their realisations in A and B. Algebras and homomorphisms of an algebraic theory <sup>T</sup> form a category <sup>A</sup>lg(<sup>T</sup> ).

Definition 31. *Let* <sup>M</sup> *be a monad on* <sup>S</sup>et*, and* <sup>T</sup> *an algebraic theory. We say that* <sup>T</sup> presents <sup>M</sup> *if and only if* EM(M) *and* <sup>A</sup>lg(<sup>T</sup> ) *are isomorphic.*

Left $\mathcal{S}$-semimodules are algebras for the theory $\mathcal{LSM} = (\Sigma_{\mathcal{LSM}}, E_{\mathcal{LSM}})$ where $\Sigma_{\mathcal{LSM}} = \{+, 0\} \cup \{\lambda \cdot \mid \lambda \in \mathcal{S}\}$ and $E_{\mathcal{LSM}}$ is the set of axioms in Table 1. As already mentioned in Section 3, left $\mathcal{S}$-semimodules are exactly $\mathcal{S}$-algebras, and morphisms of $\mathcal{S}$-semimodules coincide with those of $\mathcal{S}$-algebras. Thus, the theory $\mathcal{LSM}$ presents the monad $\mathcal{S}$.
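As a side illustration of the free $\mathcal{S}$-semimodule $\mathcal{S}X$, the following Python sketch (a toy encoding of ours, with `eta` and `mu` as assumed names, not from the paper) represents finitely supported functions over $\mathcal{S} = \mathbb{R}^+$ as dictionaries and implements the unit $\eta^{\mathcal{S}}$ (Dirac functions) and the multiplication $\mu^{\mathcal{S}}$ (weighted sums).

```python
def eta(x):
    """Dirac function Delta_x."""
    return {x: 1.0}

def mu(Psi):
    """mu^S: flatten a finitely supported function over finitely supported
    functions by taking weighted sums; inner functions are encoded as
    frozensets of (element, weight) pairs so that they can be dict keys."""
    out = {}
    for phi, w in Psi.items():
        for x, v in dict(phi).items():
            out[x] = out.get(x, 0.0) + w * v
    return {x: v for x, v in out.items() if v != 0}

phi1 = frozenset({'a': 1.0, 'b': 2.0}.items())
phi2 = frozenset({'b': 3.0}.items())
print(eta('a'))                      # {'a': 1.0}
print(mu({phi1: 0.5, phi2: 2.0}))    # {'a': 0.5, 'b': 7.0}
```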

Similarly, semilattices are algebras for the theory $\mathcal{SL} = (\Sigma_{\mathcal{SL}}, E_{\mathcal{SL}})$ where $\Sigma_{\mathcal{SL}} = \{\sqcup, \bot\}$ and $E_{\mathcal{SL}}$ is the set of axioms in Table 1. It is well known that semilattices are algebras for the *finite* powerset monad. Actually, this monad is presented by $\mathcal{SL}$. In order to present the full powerset monad $\mathcal{P}$ we need to take joins of arbitrary arity. A *complete semilattice* is a set $X$ equipped with joins $\bigsqcup_{x \in A} x$ for all (not necessarily finite) $A \subseteq X$. Formally, the (infinitary) theory of *complete semilattices* is given as $\mathcal{CSL} = (\Sigma_{\mathcal{CSL}}, E_{\mathcal{CSL}})$ where $\Sigma_{\mathcal{CSL}} = \{\bigsqcup_I \mid I \text{ a set}\}$ and $E_{\mathcal{CSL}}$ is the set of axioms displayed in Table 3 (for a detailed treatment of infinitary algebraic theories see, for example, [30]).

Table 3. The set of axioms $E_{\mathcal{CSL}}$ for complete semilattices: the second axiom generalises the usual idempotency and commutativity properties of the finitary $\sqcup$, while the third one generalises associativity and neutrality of $\bigsqcup_{\emptyset} = \bot$.

$$\begin{aligned} \bigsqcup_{i \in \{0\}} x_i &= x_0 \\ \bigsqcup_{j \in J} x_j &= \bigsqcup_{i \in I} x_{f(i)} && \text{for all } f \colon I \to J \text{ surjective} \\ \bigsqcup_{i \in I} x_i &= \bigsqcup_{j \in J} \bigsqcup_{i \in f^{-1}(j)} x_i && \text{for all } f \colon I \to J \end{aligned}$$

We can now illustrate the theory $(\Sigma, E)$ presenting the composed monad $\mathcal{P}_c\mathcal{S}$: the operations in $\Sigma$ are exactly those of complete semilattices and $\mathcal{S}$-semimodules, while the axioms are those of complete semilattices and $\mathcal{S}$-semimodules together with the set $E_D$ of *distributivity* axioms illustrated below.

$$\lambda \cdot \bigsqcup\_{i \in I} x\_i = \bigsqcup\_{i \in I} \lambda \cdot x\_i \quad \text{for } \lambda \neq 0, \qquad \bigsqcup\_{i \in I} x\_i + \bigsqcup\_{j \in J} y\_j = \bigsqcup\_{(i,j) \in I \times J} x\_i + y\_j \tag{20}$$

In short, Σ = ΣCSL ∪ ΣLSM and E = ECSL ∪ ELSM ∪ ED.

Theorem 32. *The monad* P<sup>c</sup>S *is presented by the algebraic theory* (Σ,E)*.*

The presentation crucially relies on the fact that $\mathcal{P}_c\mathcal{S}$ is obtained by composing $\mathcal{P}$ and $\mathcal{S}$ via $\delta$. Indeed, we know from general results in [11,15] that $\mathcal{P}_c\mathcal{S}$-algebras are in one-to-one correspondence with $\delta$-algebras [3], namely triples $(X, a, b)$ such that $a\colon \mathcal{S}X \to X$ is an $\mathcal{S}$-algebra, $b\colon \mathcal{P}X \to X$ is a $\mathcal{P}$-algebra, and the compatibility condition (21) below holds:

$$b \circ \mathcal{P}a \circ \delta_X \;=\; a \circ \mathcal{S}b \;\colon\; \mathcal{S}\mathcal{P}X \to X \tag{21}$$

The $\mathcal{S}$-algebra $a$ corresponds to an $\mathcal{S}$-semimodule $(X, +, 0, \lambda\cdot)$, the $\mathcal{P}$-algebra $b$ to a complete lattice $(X, \bigsqcup_I)$, and the commutativity of diagram (21) expresses exactly the distributivity axioms in (20).

*Example 33.* Let $\mathcal{S}$ be $\mathbb{R}^+$ and let $[a, b]$ with $a, b \in \mathbb{R}^+$ denote the set $\{x \in \mathbb{R}^+ \mid a \leq x \leq b\}$ and $[a, \infty)$ the set $\{x \in \mathbb{R}^+ \mid a \leq x\}$. For $1 = \{x\}$, $\mathcal{P}_c\mathcal{S}(1) = \{\emptyset\} \cup \{[a, b] \mid a, b \in \mathbb{R}^+\} \cup \{[a, +\infty) \mid a \in \mathbb{R}^+\}$. The $\mathcal{P}_c\mathcal{S}$-algebra $\mu^{\mathcal{P}_c\mathcal{S}}_1\colon \mathcal{P}_c\mathcal{S}\mathcal{P}_c\mathcal{S}1 \to \mathcal{P}_c\mathcal{S}1$ induces a $\delta$-algebra where the structure of complete lattice is given as<sup>2</sup>

$$\bigsqcup\_{i \in I} A\_i = \begin{cases} [\inf\_{i \in I} a\_i, \sup\_{i \in I} b\_i] & \text{if, for all } i \in I, \ A\_i = [a\_i, b\_i] \wedge \sup\_{i \in I} b\_i \in \mathbb{R}^+ \\\ [\inf\_{i \in I} a\_i, \infty) & \text{otherwise} \end{cases}$$

The $\mathbb{R}^+$-semimodule structure is as expected, e.g., $[a_1, b_1] + [a_2, b_2] = [a_1 + a_2, b_1 + b_2]$.

<sup>2</sup> For the sake of brevity, we are ignoring the case where some <sup>A</sup><sup>i</sup> <sup>=</sup> <sup>∅</sup>.

Finite Joins and Finitely Generated Convex Sets. We now consider the algebraic theory $(\Sigma', E')$ obtained by restricting $(\Sigma, E)$ to finitary joins. More precisely, we fix

$$
\Sigma' = \Sigma\_{\mathcal{SL}} \cup \Sigma\_{\mathcal{LSM}} \qquad E' = E\_{\mathcal{SL}} \cup E\_{\mathcal{LSM}} \cup E\_{\mathcal{D'}}
$$

where $(\Sigma_{\mathcal{SL}}, E_{\mathcal{SL}})$ is the algebraic theory for semilattices, $(\Sigma_{\mathcal{LSM}}, E_{\mathcal{LSM}})$ is the one for $\mathcal{S}$-semimodules, and $E_{D'}$ is the set of distributivity axioms illustrated in Table 1. Thanks to the characterisation provided by Theorem 32, we easily obtain a function translating $\Sigma'$-terms into convex subsets.

Proposition 34. *Let $T_{\Sigma',E'}(X)$ be the set of $\Sigma'$-terms with variables in $X$ quotiented by $E'$. Let $[\![\cdot]\!]_X\colon T_{\Sigma',E'}(X) \to \mathcal{P}_c\mathcal{S}(X)$ be the function defined as*

$$\begin{array}{ll}
[\![x]\!] = \{\Delta_x\} \ \text{ for } x \in X &
[\![\lambda \cdot t]\!] = \begin{cases} \{\lambda \cdot^{\mu^{\mathcal{S}}} f \mid f \in [\![t]\!]\} & \text{if } \lambda \neq 0 \\ \{0^{\mu^{\mathcal{S}}}\} & \text{if } \lambda = 0\end{cases} \\[2ex]
[\![0]\!] = \{0^{\mu^{\mathcal{S}}}\} &
[\![t_1 + t_2]\!] = \{f_1 +^{\mu^{\mathcal{S}}} f_2 \mid f_1 \in [\![t_1]\!],\ f_2 \in [\![t_2]\!]\} \\[2ex]
[\![\bot]\!] = \emptyset &
[\![t_1 \sqcup t_2]\!] = \overline{[\![t_1]\!] \cup [\![t_2]\!]}^{\mu^{\mathcal{S}}}
\end{array}$$

*Let $[\![\cdot]\!]\colon T_{\Sigma',E'} \to \mathcal{P}_c\mathcal{S}$ be the family $\{[\![\cdot]\!]_X\}_{X \in |\mathsf{Set}|}$. Then $[\![\cdot]\!]\colon T_{\Sigma',E'} \to \mathcal{P}_c\mathcal{S}$ is a map of monads and, moreover, each $[\![\cdot]\!]_X\colon T_{\Sigma',E'}(X) \to \mathcal{P}_c\mathcal{S}(X)$ is injective.*

We say that a set $\mathcal{A} \in \mathcal{P}_c\mathcal{S}(X)$ is *finitely generated* if there exists a finite set $\mathcal{B} \subseteq \mathcal{S}(X)$ such that $\overline{\mathcal{B}}^{\mu^{\mathcal{S}}_X} = \mathcal{A}$. We write $\mathcal{P}^f_c\mathcal{S}(X)$ for the set of all $\mathcal{A} \in \mathcal{P}_c\mathcal{S}(X)$ that are finitely generated. The assignment $X \mapsto \mathcal{P}^f_c\mathcal{S}(X)$ gives rise to a monad $\mathcal{P}^f_c\mathcal{S}\colon$ Set $\to$ Set where the action on functions, the unit and the multiplication are defined as for $\mathcal{P}_c\mathcal{S}$.

Theorem 35. *The monads $T_{\Sigma',E'}$ and $\mathcal{P}^f_c\mathcal{S}$ are isomorphic. Therefore $(\Sigma', E')$ is a presentation for the monad $\mathcal{P}^f_c\mathcal{S}$.*

*Example 36.* Recall $\mathcal{P}_c\mathcal{S}(1)$ for $\mathcal{S} = \mathbb{R}^+$ from Example 33. By restricting to the finitely generated convex sets, one obtains $\mathcal{P}^f_c\mathcal{S}(1) = \{\emptyset\} \cup \{[a, b] \mid a, b \in \mathbb{R}^+\}$, that is, the sets of the form $[a, \infty)$ are not finitely generated. Table 4 illustrates the isomorphism $[\![\cdot]\!]\colon T_{\Sigma',E'}(1) \to \mathcal{P}_c\mathcal{S}(1)$. It is worth observing that every closed interval $[a, b]$ is denoted by a term in $T_{\Sigma',E'}(1)$ for $1 = \{x\}$: indeed, $[\![(a \cdot x) \sqcup (b \cdot x)]\!] = [a, b]$. For $2 = \{x, y\}$, $\mathcal{P}^f_c\mathcal{S}(2)$ is the set containing all convex polygons: for instance, the term $(r_1 \cdot x + s_1 \cdot y) \sqcup (r_2 \cdot x + s_2 \cdot y) \sqcup (r_3 \cdot x + s_3 \cdot y)$ denotes a triangle with vertices $(r_i, s_i)$. For $n = \{x_0, \dots, x_{n-1}\}$, it is easy to see that $\mathcal{P}^f_c\mathcal{S}(n)$ contains all convex $n$-polytopes.

Table 4. The inductive definition of the function $[\![\cdot]\!]_1\colon T_{\Sigma',E'}(1) \to \mathcal{P}_c\mathcal{S}(1)$ for $1 = \{x\}$.

$$\begin{aligned}
[\![x]\!] &= [1,1] & [\![\lambda \cdot t]\!] &= \begin{cases} [\lambda a, \lambda b] & \text{if } \lambda \neq 0,\ [\![t]\!] = [a,b] \\ \emptyset & \text{if } \lambda \neq 0,\ [\![t]\!] = \emptyset \\ [0,0] & \text{if } \lambda = 0 \end{cases} \\
[\![0]\!] &= [0,0] & [\![t_1 + t_2]\!] &= \begin{cases} [a_1+a_2, b_1+b_2] & \text{if } [\![t_i]\!] = [a_i,b_i] \\ \emptyset & \text{otherwise} \end{cases} \\
[\![\bot]\!] &= \emptyset & [\![t_1 \sqcup t_2]\!] &= \begin{cases} [\min_i a_i, \max_i b_i] & \text{if } [\![t_i]\!] = [a_i,b_i] \\ [a_1,b_1] & \text{if } [\![t_1]\!] = [a_1,b_1],\ [\![t_2]\!] = \emptyset \\ [a_2,b_2] & \text{if } [\![t_2]\!] = [a_2,b_2],\ [\![t_1]\!] = \emptyset \\ \emptyset & \text{otherwise} \end{cases}
\end{aligned}$$
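The following illustrative Python evaluator (our own encoding, written under the assumptions of Table 4; the constructor tags `'x'`, `'zero'`, `'bot'`, `'scale'`, `'plus'`, `'join'` are ours) computes $[\![t]\!]$ for $\Sigma'$-terms over $1 = \{x\}$, representing closed intervals as pairs and the empty set as `None`.

```python
def ev(t):
    """Evaluate a Sigma'-term over the single variable x into an interval of R+."""
    kind = t[0]
    if kind == 'x':    return (1.0, 1.0)
    if kind == 'zero': return (0.0, 0.0)
    if kind == 'bot':  return None
    if kind == 'scale':
        lam, A = t[1], ev(t[2])
        if lam == 0:
            return (0.0, 0.0)
        return None if A is None else (lam * A[0], lam * A[1])
    if kind == 'plus':
        A, B = ev(t[1]), ev(t[2])
        return None if A is None or B is None else (A[0] + B[0], A[1] + B[1])
    if kind == 'join':
        A, B = ev(t[1]), ev(t[2])
        if A is None: return B
        if B is None: return A
        return (min(A[0], B[0]), max(A[1], B[1]))
    raise ValueError(kind)

# [[ (a.x) join (b.x) ]] = [a, b], as observed in Example 36:
print(ev(('join', ('scale', 2.0, ('x',)), ('scale', 5.0, ('x',)))))  # (2.0, 5.0)
```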

#### 7 Conclusions: Related and Future Work

Our work was inspired by [17] where Goy and Petrisan compose the monads of powerset and probability distributions by means of a weak distributive law in the sense of Garner [15]. Our results also heavily rely on the work of Clementino et al. [12] that illustrates necessary and sufficient conditions on a semiring $\mathcal{S}$ for the existence of a weak distributive law $\delta\colon \mathcal{S}\mathcal{P} \to \mathcal{P}\mathcal{S}$. However, to the best of our knowledge, the alternative characterisation of $\delta$ provided by Theorem 21 was never shown.

Such a characterisation is essential for giving a handy description of the lifting $\tilde{\mathcal{P}}\colon$ EM($\mathcal{S}$) $\to$ EM($\mathcal{S}$) (Theorem 24), as well as for observing the strong relationships with the work of Jacobs (Remark 25) and that of Klin and Rot (Remark 23). The weak distributive law $\delta$ also plays a key role in providing the algebraic theories presenting the composed monad $\mathcal{P}_c\mathcal{S}$ (Theorem 32) and its finitary restriction $\mathcal{P}^f_c\mathcal{S}$ (Theorem 35). These two theories resemble those appearing in, respectively, [17] and [10], where the monad of probability distributions plays the role of the monad $\mathcal{S}$ in our work.

Theorem 30 allows us to reuse the framework of coalgebraic trace semantics [20] for modelling over Kl($\mathcal{P}_c\mathcal{S}$) systems with both nondeterminism and quantitative features. The alternative framework based on coalgebras over EM($\mathcal{P}_c\mathcal{S}$) directly leads to *nondeterministic weighted automata*. A proper comparison with those in [13] is left as future work. Thanks to the abstract results in [7], language equivalence for such coalgebras could be checked by means of coinductive up-to techniques. It is worth remarking that, since $\delta$ is a weak distributive law, then thanks to the work in [16], up-to techniques are also sound for "convex bisimilarity" (in coalgebraic terms, behavioural equivalence for the lifted functor $\tilde{\mathcal{P}}\colon$ EM($\mathcal{S}$) $\to$ EM($\mathcal{S}$)).

We conclude by recalling that we have two main examples of positive semifields: Bool and $\mathbb{R}^+$. The Booleans could lead to a coalgebraic modal logic and trace semantics for *alternating automata* in the style of [27]. For $\mathbb{R}^+$, we hope that, by exploiting the ideas in [34], our monad could shed some light on the behaviour of linear dynamical systems featuring some sort of nondeterminism.

### References


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **One-way Resynchronizability of Word Transducers**⋆

Sougata Bose<sup>1</sup>, S.N. Krishna<sup>2</sup>, Anca Muscholl<sup>1</sup>, and Gabriele Puppis<sup>3</sup>

<sup>1</sup> LaBRI, University of Bordeaux, Bordeaux, France

<sup>2</sup> Dept. of Computer Science & Engineering IIT Bombay, Bombay, India <sup>3</sup> Dept. of Mathematics, Computer Science, and Physics, Univ. of Udine, Udine, Italy gabriele.puppis@uniud.it

**Abstract.** The origin semantics for transducers was proposed in 2014, and it led to various characterizations and decidability results that are in contrast with the classical semantics. In this paper we add a further decidability result for characterizing transducers that are close to one-way transducers in the origin semantics. We show that it is decidable whether a non-deterministic two-way word transducer can be resynchronized by a bounded, regular resynchronizer into an origin-equivalent one-way transducer. The result is in contrast with the usual semantics, where it is undecidable to know if a non-deterministic two-way transducer is equivalent to some one-way transducer.

**Keywords:** String transducers · Resynchronizers · One-way transducers

### **1 Introduction**

Regular word-to-word functions form a robust and expressive class of transformations, as they correspond to deterministic two-way transducers, to deterministic streaming string transducers [1], and to monadic second-order logical transductions [11]. However, the transition from word languages to functions over words is often quite tricky. One of the challenges is to come up with effective characterizations of restricted transformations. A first example is the characterization of functions computed by one-way transducers (known as *rational functions*). It turns out that it is decidable whether a regular function is rational [14], but the algorithm is quite involved [3]. In addition, non-determinism makes the problem intractable: it is undecidable whether the relation computed by a nondeterministic two-way transducer can be also computed by a one-way transducer, [2]. A second example is the problem of knowing whether a regular word function can be described by a first-order logical transduction. This question is still open in general [16], and it is only known how to decide if a *rational* function is definable in first-order logic [13].

Word transducers with origin semantics were introduced by Bojańczyk [4] and shown to provide a machine-independent characterization of regular word-to-word functions. The origin semantics, as the name suggests, means tagging the output by the positions of the input that generated that output.

⋆ Work supported by ANR DeLTA (ANR-16-CE40-0007) and ReLaX.

© The Author(s) 2021. S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 124–143, 2021. https://doi.org/10.1007/978-3-030-71995-1_7

Fig. 1: On the left, an input-output pair for a transducer T that reads wd and outputs dw, d ∈ Σ, w ∈ Σ∗, the arrows denoting origins. On the right, the same input-output pair, but with origins modified by a resynchronizer R. The resynchronized relation R(T) is order-preserving, and T is one-way resynchronizable.

A nice phenomenon is that origins can restore decidability for some interesting problems. For example, the equivalence of word relations computed by one-way transducers, which is undecidable in the classical semantics [18,19], is PSPACE-complete for two-way non-deterministic transducers in the origin semantics [7]. Another, deeper, observation is that the origin semantics provides an algebraic approach that can be used to decide fragments. For example, [4] provides an effective characterization of first-order definable word functions under the origin semantics. As for the problem of knowing whether a regular word function is rational, it becomes almost trivial in the origin semantics.

A possible objection against the origin semantics is that the comparison of two transducers in the origin semantics is too strict. Resynchronizations were proposed in order to overcome this issue. A resynchronization is a binary relation between input-output pairs with origins, that preserves the input and the output, changing only the origins. Resynchronizations were introduced for one-way transducers [15], and later for two-way transducers [7]. For one-way transducers, *rational* resynchronizations are transducers acting on the synchronization languages, whereas for two-way transducers, *regular* resynchronizations are described by regular properties over the input that restrict the change of origins. The class of bounded<sup>4</sup> regular resynchronizations was shown to behave very nicely, preserving the class of transductions defined by non-deterministic, two-way transducers: for any bounded regular resynchronization R and any two-way transducer T, the resynchronized relation R(T) can be computed by another two-way transducer [7]. In particular, non-deterministic, two-way transducers can be effectively compared modulo bounded regular resynchronizations.

As mentioned above, it is easy to know if a two-way transducer is equivalent under the origin semantics to some one-way transducer [4], since this is equivalent to being order-preserving. But what happens if this is not the case? Still, the given transducer T can be "close" to some order-preserving transducer. What we mean here by "close" is that there exists some bounded regular resynchronizer R such that R(T) is order-preserving and all input-output pairs with origins produced by T are in the domain of R. We call such transducers *one-way resynchronizable*. Figure 1 gives an example.

<sup>4</sup> "Bounded" refers here to the number of source positions that are mapped to the same target position. It rules out resynchronizations such as the universal one.

In this paper we show that it is decidable if a two-way transducer is one-way resynchronizable. We first solve the problem for bounded-visit two-way transducers. A bounded-visit transducer is one for which there is a uniform bound on the number of visits of any input position. Then, we use the previous result to show that one-way resynchronizability is decidable for arbitrary two-way transducers, so without the bounded-visit restriction. This is done by constructing, if possible, a bounded, regular resynchronization from the given transducer to a bounded-visit transducer with regular language outputs. Finally, we show that bounded regular resynchronizations are closed under composition, and this allows us to combine the previous construction with our decidability result for bounded-visit transducers.

*Related work and paper overview.* The synthesis problem for resynchronizers asks to compute a resynchronizer from one transducer to another one, when the two transducers are given as input. The problem was studied in [6] and shown to be decidable for unambiguous two-way transducers (it is open for unrestricted transducers). The paper [21] shows that the containment version of the above problem is undecidable for unrestricted one-way transducers.

The origin semantics for streaming string transducers (SST) [1] has been studied in [5], providing a machine-independent characterization of the sets of origin graphs generated by SSTs. An open problem here is to characterize origin graphs generated by aperiodic streaming string transducers [10,16]. Going beyond words, [17] investigates decision problems of tree transducers with origin, and regains the decidability of the equivalence problem for non-deterministic top-down and MSO transducers by considering the origin semantics. An open problem for tree transducers with origin is that of synthesizing resynchronizers as in the word case.

We will recall regular resynchronizations in Section 3. Section 4 provides the proof ingredients for the bounded-visit case, and the proof of decidability of one-way resynchronizability in the bounded-visit case can be found in Section 5. Finally, in Section 6 we sketch the proof in the general case. A full version of the paper is available at https://arxiv.org/abs/2101.08011.

#### **2 Preliminaries**

Let Σ be a finite input alphabet. Given a word w ∈ Σ<sup>∗</sup> of length |w| = n, a *position* is an element of its domain dom(w) = {1,...,n}. For every position i, w(i) denotes the letter at that position. A *cut* of w is any number from 1 to |w| + 1, so a cut identifies a position *between* two consecutive letters of the input. The cut i = 1 represents the position just before the first input letter, and i = |w| + 1 the position just after the last letter of w.

*Two-way transducers.* We use two-way transducers as defined in [3,6], with a slightly different presentation than in classical papers such as [22]. As usual for two-way machines, for any input $w \in \Sigma^*$, $w(0) = \vdash$ and $w(|w| + 1) = \dashv$, where $\vdash, \dashv \notin \Sigma$ are special markers used as delimiters.

A *two-way transducer* (or just *transducer* from now on) is a tuple $T = (Q, \Sigma, \Gamma, \Delta, I, F)$, where $\Sigma, \Gamma$ are respectively the input and output alphabets, $Q = Q_\prec \uplus Q_\succ$ is the set of states, partitioned into left-reading states from $Q_\prec$ and right-reading states from $Q_\succ$, $I \subseteq Q$ is the set of initial states, $F \subseteq Q$ is the set of final states, and $\Delta \subseteq Q \times (\Sigma \cup \{\vdash, \dashv\}) \times \Gamma^* \times Q$ is the finite transition relation. Left-reading states read the letter to the left, whereas right-reading states read the letter to the right. This partitioning will also determine the head movement during a transition, as explained below.

As usual, to define runs of transducers we first define configurations. Given a transducer $T$ and a word $w \in \Sigma^*$, a *configuration* of $T$ on $w$ is a state-cut pair $(q, i)$, with $q \in Q$ and $1 \leq i \leq |w| + 1$. A configuration $(q, i)$ means that the automaton is in state $q$ and its head is between the $(i-1)$-th and the $i$-th letter of $w$. The transitions that depart from a configuration $(q, i)$ and read $a$ are denoted $(q, i) \xrightarrow{a \mid v} (q', i')$, and must satisfy one of the following:

(1) $q \in Q_\succ$, $q' \in Q_\succ$, $a = w(i)$, $(q, a, v, q') \in \Delta$, and $i' = i + 1$;
(2) $q \in Q_\succ$, $q' \in Q_\prec$, $a = w(i)$, $(q, a, v, q') \in \Delta$, and $i' = i$;
(3) $q \in Q_\prec$, $q' \in Q_\succ$, $a = w(i - 1)$, $(q, a, v, q') \in \Delta$, and $i' = i$;
(4) $q \in Q_\prec$, $q' \in Q_\prec$, $a = w(i - 1)$, $(q, a, v, q') \in \Delta$, and $i' = i - 1$.

When $T$ has only right-reading states (i.e. $Q_\prec = \emptyset$), its head can only move rightward. In this case we call $T$ a *one-way transducer*.

A *run* of $T$ on $w$ is a sequence $\rho = (q_1, i_1) \xrightarrow{a_{j_1} \mid v_1} (q_2, i_2) \xrightarrow{a_{j_2} \mid v_2} \cdots \xrightarrow{a_{j_m} \mid v_m} (q_{m+1}, i_{m+1})$ of configurations connected by transitions. Note that the positions $j_1, j_2, \dots, j_m$ of letters do not need to be ordered from smaller to bigger, and can differ slightly (by $+1$ or $-1$) from the cuts $i_1, i_2, \dots, i_{m+1}$, since cuts take values in between consecutive letters.

A configuration (q, i) on w is *initial* (resp. *final*) if q ∈ I and i = 1 (resp. q ∈ F and i = |w| + 1). A run is *successful* if it starts with an initial configuration and ends with a final configuration. The *output* associated with a successful run ρ as above is the word v1v<sup>2</sup> ··· v<sup>m</sup> ∈ Γ∗. A transducer T defines a relation [[T]] ⊆ Σ<sup>∗</sup> ×Γ<sup>∗</sup> consisting of all the pairs (u, v) such that v is the output of some successful run ρ of T on u.

*Origin semantics.* In the origin semantics for transducers [4] the output is tagged with information about the position of the input where it was produced. If reading the $i$-th letter of the input we output $v$, then all letters of $v$ are tagged with $i$, and we say they have *origin* $i$. We use the notation $(v, i)$ for $v \in \Gamma^*$ to denote that all positions in the output word $v$ have origin $i$, and we view $(v, i)$ as a word over the alphabet $\Gamma \times \mathbb{N}$. The outputs associated with a successful run $\rho = (q_1, i_1) \xrightarrow{b_1 \mid v_1} (q_2, i_2) \xrightarrow{b_2 \mid v_2} (q_3, i_3) \cdots \xrightarrow{b_m \mid v_m} (q_{m+1}, i_{m+1})$ in the origin semantics are the words of the form $\nu = (v_1, j_1)(v_2, j_2)\cdots(v_m, j_m)$ over $\Gamma \times \mathbb{N}$ where, for all $1 \leq k \leq m$, $j_k = i_k$ if $q_k \in Q_\succ$, and $j_k = i_k - 1$ if $q_k \in Q_\prec$. Under the origin semantics, the relation defined by $T$, denoted $[\![T]\!]_o$, is the set of pairs $\sigma = (u, \nu)$ (called *synchronized pairs*) such that $u \in \Sigma^*$ and $\nu \in (\Gamma \times \mathbb{N})^*$ is the output of some successful run on $u$.

Equivalently, a synchronized pair $(u, \nu)$ can be described as a triple $(u, v, \mathit{orig})$, where $v$ is the projection of $\nu$ on $\Gamma$, and $\mathit{orig}\colon \mathrm{dom}(v) \to \mathrm{dom}(u)$ associates with each position of $v$ its origin in $u$. So for $\nu = (v_1, j_1)(v_2, j_2)\cdots(v_m, j_m)$ as above, $v = v_1 \dots v_m$, and, for all positions $i$ s.t. $|v_1 \dots v_{k-1}| < i \leq |v_1 \dots v_k|$, we have $\mathit{orig}(i) = j_k$. Given two transducers $T_1, T_2$, we say they are *origin-equivalent* if $[\![T_1]\!]_o = [\![T_2]\!]_o$. Note that two transducers $T_1, T_2$ can be equivalent in the classical semantics, $[\![T_1]\!] = [\![T_2]\!]$, while having different origin semantics, i.e. $[\![T_1]\!]_o \neq [\![T_2]\!]_o$.

*Bounded-visit transducers.* Let $k > 0$ be some integer, and $\rho$ some run of a two-way transducer $T$. We say that $\rho$ is $k$*-visit* if for every $i \geq 0$, it has at most $k$ occurrences of configurations from $Q \times \{i\}$. We call a transducer $T$ $k$*-visit* if for every $\sigma \in [\![T]\!]_o$ there is some successful, $k$-visit run $\rho$ of $T$ with output $\sigma$ (actually we should call the transducer $k$-visit *in the origin semantics*, but for simplicity we omit this). For example, the relation $\{(w, \overline{w}) \mid w \in \Sigma^*\}$, where $\overline{w}$ denotes the reverse of $w$, can be computed by a 3-visit transducer. A transducer is called *bounded-visit* if it is $k$-visit for some $k$.
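To make the definitions above concrete, here is a minimal, hypothetical Python simulator (our own encoding; it assumes a deterministic transition function and ad-hoc endmarker symbols) that follows the head-movement cases (1)-(4) and the origin convention, and runs the 3-visit reversal transducer just mentioned.

```python
LEFT, RIGHT = '>', '<'   # assumed encodings of the left and right endmarkers

def run(delta, q0, finals, w):
    """delta maps (state, letter) -> (output, next_state); states are pairs
    (name, tag) with tag 'R' for right-reading and 'L' for left-reading.
    Returns the origin-tagged output of the unique run, or None if stuck."""
    ext = LEFT + w + RIGHT                 # ext[0] and ext[len(w)+1] are the markers
    q, i = q0, 1                           # initial configuration: cut 1
    out = []
    for _ in range(100 * (len(w) + 2)):    # crude guard against infinite loops
        if q in finals and i == len(w) + 1:
            return out                     # final configuration reached
        pos = i if q[1] == 'R' else i - 1  # right-reading reads w(i), left-reading w(i-1)
        a = ext[pos]
        if (q, a) not in delta:
            return None
        v, q2 = delta[(q, a)]
        out.extend((c, pos) for c in v)    # each output letter gets origin = position read
        if q[1] == 'R' and q2[1] == 'R':
            i += 1                         # case (1): move right
        elif q[1] == 'L' and q2[1] == 'L':
            i -= 1                         # case (4): move left
        # cases (2) and (3): the cut stays where it is
        q = q2
    return None

# A 3-visit transducer for {(w, reverse(w))}: scan right silently, output while
# moving left, then silently return to the right end.
delta = {}
for a in 'ab':
    delta[(('scan', 'R'), a)] = ('', ('scan', 'R'))
    delta[(('back', 'L'), a)] = (a, ('back', 'L'))
    delta[(('fwd', 'R'), a)] = ('', ('fwd', 'R'))
delta[(('scan', 'R'), RIGHT)] = ('', ('back', 'L'))
delta[(('back', 'L'), LEFT)] = ('', ('fwd', 'R'))

print(run(delta, ('scan', 'R'), {('fwd', 'R')}, 'abb'))
# [('b', 3), ('b', 2), ('a', 1)]
```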

*Common guess.* It is often useful to work with a variant of two-way transducers that can guess beforehand some annotation on the input and inspect it consistently when visiting portions of the input multiple times. This feature is called *common guess* [5], and strictly increases the expressive power of two-way transducers, including bounded-visit ones.

### **3 One-way resynchronizability**

#### **3.1 Regular resynchronizers**

Resynchronizations are used to compare transductions in the origin semantics. A *resynchronization* is a binary relation $R \subseteq (\Sigma^* \times (\Gamma \times \mathbb{N})^*)^2$ over synchronized pairs such that $(\sigma, \sigma') \in R$ implies that $\sigma = (u, v, \mathit{orig})$ and $\sigma' = (u, v, \mathit{orig}')$ for some origin mappings $\mathit{orig}, \mathit{orig}'\colon \mathrm{dom}(v) \to \mathrm{dom}(u)$. In other words, a resynchronization will only change the origin mapping, but neither the input, nor the output. Given a relation $S \subseteq \Sigma^* \times (\Gamma \times \mathbb{N})^*$ with origins, the *resynchronized relation* $R(S)$ is defined as $R(S) = \{\sigma' \mid (\sigma, \sigma') \in R,\ \sigma \in S\}$. For a transducer $T$ we abbreviate $R([\![T]\!]_o)$ by $R(T)$. The typical use of a resynchronization $R$ is to ask, given two transducers $T, T'$, whether $R(T)$ and $T'$ are origin-equivalent.

Regular resynchronizers (originally called MSO resynchronizers) were introduced in [7] as a resynchronization mechanism that preserves definability by two-way transducers. They were inspired by MSO (monadic second-order) transductions [9,12] and they are formally defined as follows. A *regular resynchronizer* is a tuple $R = (I, O, \mathit{ipar}, \mathit{opar}, (\mathit{move}_\tau)_\tau, (\mathit{next}_{\tau,\tau'})_{\tau,\tau'})$ consisting of

- a tuple $I = (I_1, \dots, I_m)$ of monadic parameters interpreted over the input word, and a tuple $O = (O_1, \dots, O_n)$ of monadic parameters interpreted over the output word,
- MSO sentences $\mathit{ipar}$ and $\mathit{opar}$ constraining the annotated input and the annotated output, respectively,
- MSO formulas $\mathit{move}_\tau(y, z)$ and $\mathit{next}_{\tau,\tau'}(z, z')$ over the annotated input, with two free first-order variables, one formula for every output type $\tau$ (resp. every pair of output types $\tau, \tau'$).
To apply a regular resynchronizer as above, one first guesses the valuation of all the predicates I<sup>j</sup> , Ok, and uses it to interpret the parameters I and O. Based on the chosen valuation of the parameters O, each position x of the output v gets an associated *type* <sup>τ</sup><sup>x</sup> = (v(x), b1,...,bn) <sup>∈</sup> <sup>Γ</sup> × {0, <sup>1</sup>}<sup>n</sup>, where <sup>b</sup><sup>j</sup> is 1 or 0 depending on whether x ∈ O<sup>j</sup> or not. We refer to the output word together with the valuation of the output parameters as *annotated output*, so a word over <sup>Γ</sup> × {0, <sup>1</sup>}<sup>n</sup>. Similarly, the *annotated input* is a word over <sup>Σ</sup> × {0, <sup>1</sup>}<sup>m</sup>. The annotated input and output word must satisfy the formulas ipar and opar, respectively.

The origins of output positions are constrained using the formulas move<sup>τ</sup> and nextτ,τ- , which are *parametrized by output types and evaluated over the annotated input*. Intuitively, the formula move<sup>τ</sup> (y, z) states how the origin of every output position of type τ changes from y to z. We refer to y and z as *source* and *target* origin, respectively. The formula nextτ,τ- (z, z ) instead constrains the target origins z, z of any two consecutive output positions with types τ and τ , respectively.

Formally, $R = (I, O, \mathit{ipar}, \mathit{opar}, (\mathit{move}_\tau), (\mathit{next}_{\tau,\tau'}))$ defines the resynchronization consisting of all pairs $(\sigma, \sigma')$, with $\sigma = (u, v, \mathit{orig})$, $\sigma' = (u, v, \mathit{orig}')$, $u \in \Sigma^*$, and $v \in \Gamma^*$, for which there exist an annotated input $\hat{u}$ over $\Sigma \times \{0,1\}^m$ and an annotated output $\hat{v}$ over $\Gamma \times \{0,1\}^n$, projecting to $u$ and $v$ respectively, such that

- $\hat{u}$ satisfies $\mathit{ipar}$ and $\hat{v}$ satisfies $\mathit{opar}$,
- $\hat{u}$ satisfies $\mathit{move}_\tau(\mathit{orig}(x), \mathit{orig}'(x))$ for every output position $x$ of type $\tau$,
- $\hat{u}$ satisfies $\mathit{next}_{\tau,\tau'}(\mathit{orig}'(x), \mathit{orig}'(x+1))$ for all consecutive output positions $x$, $x+1$ of types $\tau$ and $\tau'$, respectively.
*Example 1.* Consider the following resynchronization $R$. A pair $(\sigma, \sigma')$ belongs to $R$ if $\sigma = (uv, uwv, \mathit{orig})$, $\sigma' = (uv, uwv, \mathit{orig}')$, with $u, v, w \in \Sigma^+$. The origins $\mathit{orig}$ and $\mathit{orig}'$ are both the identity over $u$ and $v$. The origin of every position of $w$ in $\sigma$ (hence a source origin) is either the first or the last position of $v$. The origin of every position of $w$ in $\sigma'$ (a target origin) is the first position of $v$.

This resynchronization is described by a regular resynchronizer that uses two input parameters $I_1, I_2$ to mark the last and the first positions of $v$ in the input, and one output parameter $O$ to mark the factor $w$ in the output. The formula $\mathit{move}_\tau(y, z)$ is either $(I_1(y) \vee I_2(y)) \wedge I_2(z)$ or $(y = z)$, depending on whether the type $\tau$ describes a position inside $w$ or a position outside $w$.
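As a toy rendering of Example 1 (our own encoding, not from the paper; the function name `resync_example1` is ours), the sketch below takes the origin mapping of a pair $\sigma = (uv, uwv, \mathit{orig})$ and redirects the origin of every output position inside $w$ to the first position of $v$, producing the origin mapping of $\sigma'$.

```python
def resync_example1(u, w, v, orig):
    """orig maps output positions of u w v (1-based) to input positions of u v;
    the factor w occupies output positions |u|+1 .. |u|+|w|."""
    first_of_v = len(u) + 1                      # first input position of v
    new_orig = dict(orig)
    for x in range(len(u) + 1, len(u) + len(w) + 1):
        new_orig[x] = first_of_v
    return new_orig

u, w, v = "ab", "xx", "cd"
orig = {1: 1, 2: 2,          # identity over u
        3: 4, 4: 3,          # w-positions point to the last or first position of v
        5: 3, 6: 4}          # identity over v
print(resync_example1(u, w, v, orig))   # {1: 1, 2: 2, 3: 3, 4: 3, 5: 3, 6: 4}
```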

We now turn to describing some important restrictions on (regular) resynchronizers. Let $R = (I, O, \mathit{ipar}, \mathit{opar}, (\mathit{move}_\tau), (\mathit{next}_{\tau,\tau'}))$ be a resynchronizer.

- $R$ is *$k$-bounded* if, for every output type $\tau$ and every target position $z$, there are at most $k$ source positions $y$ such that $\mathit{move}_\tau(y, z)$ holds; it is *bounded* if it is $k$-bounded for some $k$.
- $R$ is *$T$-preserving*, for a transducer $T$, if every synchronized pair of $[\![T]\!]_o$ belongs to the domain of $R$.
The boundedness restriction rules out resynchronizations such as the universal one, that imposes no restriction on the change of origins. It is a decidable restriction [7], and it guarantees that definability by two-way transducers is effectively preserved under regular resynchronizations, modulo common guess. More precisely, Theorem 16 in [7] shows that, given a bounded regular resynchronizer R and a transducer T, one can construct a transducer T with common guess that is origin-equivalent to R(T).

*Example 1 (continued).* Consider again the regular resynchronizer R described in the previous example. Note that R is 2-bounded, since at most two source origins are redirected to the same target origin. If we used an additional output parameter to distinguish, among the positions of w, those that have source origin in the first position of v and those that have source origin in the last position of v, we would get a 1-bounded, regular resynchronizer.

We state below two crucial properties of regular resynchronizers (the second lemma is reminiscent of Lemma 11 from [21], which proves closure of bounded resynchronizers with vacuous $\mathit{next}_{\tau,\tau'}$ relations).

**Lemma 1.** *Every bounded, regular resynchronizer is effectively equivalent to some* 1*-bounded, regular resynchronizer.*

**Lemma 2.** *The class of bounded, regular resynchronizers is effectively closed under composition.*

#### **3.2 Main result**

Given a two-way transducer $T$ one can ask if it is origin-equivalent to some one-way transducer. It was observed in [4] that this property holds if and only if all synchronized pairs defined by $T$ are *order-preserving*, namely, for all $\sigma = (u, v, \mathit{orig}) \in [\![T]\!]_o$ and all $y, y' \in \mathrm{dom}(v)$ with $y < y'$, we have $\mathit{orig}(y) \leq \mathit{orig}(y')$. The decidability of the above question should be contrasted to the analogous question in the classical semantics: "is a given two-way transducer classically equivalent to some one-way transducer?" The latter problem turns out to be decidable for functional transducers [14,3], but is undecidable for arbitrary two-way transducers [2].

Here we are interested in a different, more relaxed notion:

**Definition 1.** *A transducer* T *is called* one-way resynchronizable *if there exists a bounded, regular resynchronizer* R *that is* T*-preserving and such that* R(T) *is order-preserving.*

Note that if $T$ is an order-preserving transducer, then one can construct rather easily a one-way transducer $T'$ such that $T' =_o T$, by eliminating non-productive U-turns from accepting runs.

Moreover, note that without the condition of being T-preserving every transducer T would be one-way resynchronizable, using the empty resynchronization.

*Example 2.* Consider the transducer T<sup>1</sup> that moves the last letter of the input wa to the front by a first left-to-right pass that outputs the last letter a, followed by a right-to-left pass without output, and finally by a left-to-right pass that produces the remaining w. Let R be the bounded regular resynchronizer that redirects the origin of the last a to the first position. Assuming an output parameter O with an interpretation constrained by opar that marks the last position of the output, the formula move(a,1)(y, z) says that target origin z (source origin y, resp.) of the last a is the first (last, resp.) position of the input. It is easy to see that R(T1) is origin-equivalent to the one-way transducer that on input wa, guesses a and outputs aw. Thus, T<sup>1</sup> is one-way resynchronizable. See also Figure 1.

*Example 3.* Consider the transducer T<sup>2</sup> that reads inputs of the form u#v and outputs vu in the obvious way, by a first left-to-right pass that outputs v, followed by a right-to-left pass, and a finally a left-to-right pass that outputs u. Using the characterization with the notion of cross-width that we introduce below, it can be shown that T<sup>2</sup> is not one-way resynchronizable.

In order to give a flavor of our results, we anticipate here the two main theorems, before introducing the key technical concepts of cross-width and inversion (these will be defined further below).

**Theorem 1.** *For every bounded-visit transducer* $T$*, the following are equivalent:*

*(1)* $T$ *is one-way resynchronizable;*
*(2)* $T$ *has bounded cross-width;*
*(3) no successful run of* $T$ *contains an inversion.*
*Moreover, condition (3) is decidable.*

We will use Theorem 1 to show that one-way resynchronizability is decidable for arbitrary two-way transducers (not just bounded-visit ones).

**Theorem 2.** *It is decidable whether a given two-way transducer* T *is one-way resynchronizable.*

Let us now introduce the first key concept, that of cross-width:

**Definition 2 (cross-width).** *Let $\sigma = (u, v, \mathit{orig})$ be a synchronized pair and let $X_1, X_2 \subseteq \mathrm{dom}(v)$ be sets of output positions such that, for all $x_1 \in X_1$ and $x_2 \in X_2$, $x_1 < x_2$ and $\mathit{orig}(x_1) > \mathit{orig}(x_2)$. We call such a pair $(X_1, X_2)$ a* cross *and define its* width *as $\min(|\mathit{orig}(X_1)|, |\mathit{orig}(X_2)|)$, where $\mathit{orig}(X) = \{\mathit{orig}(x) \mid x \in X\}$ is the set of origins corresponding to a set $X$ of output positions. The* cross-width *of a synchronized pair $\sigma$ is the maximal width of the crosses in $\sigma$. A transducer has* bounded cross-width *if for some integer $k$, all synchronized pairs associated with successful runs of $T$ have cross-width at most $k$.*

For instance, the transducer T<sup>2</sup> in Example 3 has unbounded cross-width. In contrast, the transducer T<sup>1</sup> in Example 2 has cross-width one.
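Cross-width can be computed by brute force on small synchronized pairs; the following illustrative Python check (our own code, not from the paper) enumerates output cuts and origin thresholds, which suffices because a cross requires every origin in $X_1$ to exceed every origin in $X_2$.

```python
def cross_width(orig):
    """Maximal width of a cross (Definition 2); orig maps output positions 1..n
    to input positions."""
    n = len(orig)
    best = 0
    for p in range(1, n):                    # X1 lives in 1..p, X2 in p+1..n
        for t in set(orig.values()):         # threshold: orig(X1) > t >= orig(X2)
            o1 = {orig[x] for x in range(1, p + 1) if orig[x] > t}
            o2 = {orig[x] for x in range(p + 1, n + 1) if orig[x] <= t}
            if o1 and o2:
                best = max(best, min(len(o1), len(o2)))
    return best

# Reversal-like origins (output read off the input backwards): width grows with the input.
print(cross_width({x: 7 - x for x in range(1, 7)}))   # 3
# Order-preserving origins have no cross at all.
print(cross_width({x: x for x in range(1, 7)}))       # 0
```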

The other key notion of *inversion* will be introduced formally in the next section (page 135), as it requires a few technical definitions. The notion however is very similar in spirit to that of cross, with the difference that a single inversion is sufficient for witnessing a family of crosses with arbitrarily large cross-width.

### **4 Proof overview for Theorem 1**

This section provides an overview of the proof of Theorem 1, and introduces the main ingredients.

We will use flows (a concept inspired from crossing sequences [22,3] and revised in Section 4.1) in order to derive the key notion of inversion. Roughly speaking, an inversion in a run involves two loops that produce outputs in an order that is reversed compared to the order on origins. Inversions were also used in the characterization of one-way definability of two-way transducers under the classical semantics [3]. There, they were used for deriving some combinatorial properties of outputs. Here we are only interested in detecting inversions, and this is a simple task.

Flows will also be used to associate factorization trees with runs (the existence of factorization trees of bounded height was established by Simon's celebrated factorization theorem [23]). We will use a structural induction on these factorization trees, together with the assumption that no successful run contains an inversion, to construct a regular resynchronization witnessing one-way resynchronizability of the transducer at hand.

Another important ingredient underlying the main characterization is given by the notion of dominant output interval (Section 4.2), which is used to formalize the invariant of our inductive construction.

#### **4.1 Flows and inversions**

*Intervals.* An *interval* of a word is a set of consecutive positions in it. An interval is often denoted by $I = [i, i')$, with $i = \min(I)$ and $i' = \max(I) + 1$. Given two intervals $I = [i, i')$ and $J = [j, j')$, we write $I < J$ if $i' \leq j$, and we say that $I, J$ are adjacent if $i' = j$. The union of two adjacent intervals $I = [i, i')$, $J = [j, j')$, denoted $I \cdot J$, is the interval $[i, j')$ (if $I, J$ are not adjacent, then $I \cdot J$ is undefined).

*Subruns.* Given a run ρ of a transducer, a *subrun* is a factor of ρ. Note that a subrun of a two-way transducer may visit a position of the input several times. For an input interval I = [i, j) and a run ρ, we say that a subrun ρ of ρ *spans over* I if i (resp. j) is the smallest (resp. greatest) input position labeling some transition of ρ . The left hand-side of the figure at page 134 gives an example of an interval I of an input word together with the subruns α1, α2, α3, β1, β2, β3, γ<sup>1</sup> that span over it. Subruns spanning over an interval can be left-to-right, left-toleft, right-to-left, or right-to-right depending on where the starting and ending positions are w.r.t. the endpoints of the interval.

*Flows.* Flows are used to summarize subruns of a two-way transducer that span over a given interval. The definition below is essentially taken from [3], except for replacing "functional" by "K-visit". Formally, a *flow* of a transducer T is a graph with vertices divided into two groups, L-vertices and R-vertices, labeled by states of T, and with directed edges also divided into two groups, productive and nonproductive edges. The graph satisfies the following requirements. Edge sources are either an L-vertex labeled by a right-reading state, or an R-vertex labeled by a left-reading state, and symmetrically for edge destinations; moreover, edges are of one of the following types: LL, LR, RL, RR. Second, each node is the endpoint of exactly one edge. Finally, L (R, resp.) vertices are totally ordered, in such a way that for every LL (RR, resp.) edge (v, v ), we have v<v . We will only consider flows of K-visiting transducers, so flows with at most 2K vertices. For example, the flow in the left-hand side of the figure at page 134 has six L-vertices on the left, and six R-vertices on the right. The edges α1, α2, α<sup>3</sup> are LL, LR, and RR, respectively.

Given a run ρ of T and an interval I = [i, i ) on the input, *the flow of* ρ *on* I, denoted *flow*ρ(I), is obtained by identifying every configuration at position i (resp. i ) with an L (resp. R) vertex, labeled by the state of the configuration, and every subrun spanning over I with an edge connecting the appropriate vertices (this subrun is called the *witnessing subrun* of the edge of the flow). An edge is said to be *productive* if its witnessing subrun produces non-empty output.

*Flow monoid.* The composition of two flows F and G is defined when the Rvertices of F induce the same sequence of labels as the L-vertices of G. In this case, the composition results in the flow F ·G that has as vertices the L-vertices of F and the R-vertices of G, and for edges the directed paths in the graph obtained by glueing the R-vertices of F with the L-vertices of G so that states are matched. Productiveness of edges is inherited by paths, implying that an edge of F · G is productive if and only if the corresponding path contains at least one edge (from F or G) that is productive. When the composition is undefined, we simply write F · G = ⊥. The above definitions naturally give rise to a *flow monoid* associated with the transducer T, where elements are the flows of T, extended with a dummy element ⊥, and the product operation is given by the composition of flows, with the convention that ⊥ is absorbing. It is easy to verify that for any two adjacent intervals I<J of a run ρ, *flow*ρ(I) · *flow*ρ(J) = *flow*ρ(I · J). We denote by M<sup>T</sup> the *flow monoid* of a K-visiting transducer T.

Let us estimate the size of $M_T$. If $Q$ is the set of states of $T$, there are at most $|Q|^{2K}$ possible sequences of L- and R-vertices; and the number of sets of edges (marked as productive or not) is bounded by $\binom{2K}{K} \cdot (2K)^K \cdot 2^K \leq (2K+1)^{2K}$. Including the dummy element $\bot$ in the flow monoid, we get $|M_T| \leq (|Q| \cdot (2K+1))^{2K} + 1 =: \mathbf{M}$.

*Loops.* A loop of a run $\rho$ over input $w$ is an interval $I = [i, j)$ whose flow $F = \mathit{flow}_\rho(I)$ is an idempotent of the flow monoid, that is, $F \cdot F = F$; pumping such a loop $n$ times yields a run on the pumped input, denoted $\mathit{pump}^n_I(\rho)$.

The lemma below shows that the occurrence order relative to subruns witnessing LR or RL edges of a loop (called *straight edges*, for short) is preserved when pumping the loop. This seemingly straightforward lemma is needed for detecting inversions and its proof is surprisingly non-trivial. For example, the external edge connecting the two L-vertices 1, 2 in the figure above appears before edge α2, and also before every copy of α<sup>2</sup> in the run where loop I is pumped.

**Lemma 3.** *Let* ρ *be a run of* T *on* u*, let* J < I < K *be a partition of the domain of* u *into intervals, with* I *a loop of* ρ*, and let* F = *flow*ρ(J)*,* E = *flow*ρ(I)*, and* G = *flow*ρ(K) *be the corresponding flows. Consider an arbitrary edge* f *of either* F *or* G*, and a straight edge* e *of the idempotent flow* E*. Let* ρ_f *and* ρ_e *be the witnessing subruns of* f *and* e*, respectively. Then the occurrence order of* ρ_f *and* ρ_e *in* ρ *is the same as the occurrence order of* ρ_f *and any copy of* ρ_e *in* pump^n_I(ρ)*.*

We can now recall the key notion of inversion:

**Definition 3 (inversion).** *An* inversion *of* ρ *is a tuple* (I, e, I′, e′) *such that* I < I′ *are loops of* ρ*,* e *and* e′ *are productive straight edges of* *flow*ρ(I) *and* *flow*ρ(I′)*, respectively, and the subrun witnessing* e′ *occurs in* ρ *before the subrun witnessing* e*.*


#### **4.2 Dominant output intervals**

In this section we identify some particular intervals of the output that play an important role in the inductive construction of the resynchronizer for a one-way resynchronizable transducer.

Given n ∈ N, we say that a set B of output positions is n*-large* if |*orig*(B)| > n; otherwise, we say that B is n*-small*. Recall that here we work with a K-visiting transducer T, for some constant K, and that **M** = (|Q| · (2K + 1))^{2K} + 1 is an upper bound on the size of the flow monoid M_T. We will extensively use the derived constant **C** = **M**^{2K} to distinguish between large and small sets of output positions. The intuition behind this constant is that any set of output positions that is **C**-large must traverse a loop of ρ. This is captured by the lemma below. The proof uses algebraic properties of the flow monoid M_T [20] (see also Theorem 7.2 in [3], which proves a similar result, but with a larger constant derived from Simon's factorization theorem):

**Lemma 4.** *Let* I *be an input interval and* B *a set of output positions with origins inside* I*. If* B *is* **C***-large, then there is a loop* J ⊆ I *of* ρ *such that flow*ρ(J) *contains a productive straight edge witnessed by a subrun that intersects* B *(in particular, out*(J) ∩ B ≠ ∅*).*

We need some more notations for outputs. Given an input interval I we denote by *out*ρ(I) the set of output positions whose origins belong to I (note that this might not be an output interval). An *output block* of I is a maximal interval contained in *out*ρ(I).

The *dominant output interval* of I, denoted *bigout*ρ(I), is the smallest output interval that contains all **C**-large output blocks of I. In particular, *bigout*ρ(I) is either empty or begins with the first **C**-large output block of I and ends with the last **C**-large output block of I. We will often omit the subscript ρ from the notations *flow*ρ(I), *out*ρ(I), *bigout*ρ(I), etc., when no confusion arises.

We now fix a successful run ρ of the K-visiting transducer T. The rest of the section presents some technical lemmas that will be used in the inductive constructions for the proof of the main theorem. *In the lemmas below, we assume that all successful runs of* T *(in particular,* ρ*) avoid inversions.*

**Lemma 5.** *Let* I1 < I2 *be two input intervals and* B1, B2 *output blocks of* I1*,* I2*, respectively. If both* B1, B2 *are* **C***-large, then* B1 < B2*.*

*Proof (sketch).* If the claim did not hold, then Lemma 4 would provide some loops J1 ⊆ I1 and J2 ⊆ I2, together with some productive edges in them, witnessing an inversion.

**Lemma 6.** *Let* I = I1 · I2*,* B = *bigout*(I)*, and* Bi = *bigout*(Ii) *for* i = 1, 2*. Then* B \ (B1 ∪ B2) *is* 4K**C***-small.*

*Proof (sketch).* By Lemma 5, B1 < B2. Moreover, all **C**-large output blocks of I1 or I2 are also **C**-large output blocks of I, so B contains both B1 and B2. Suppose, by way of contradiction, that B \ (B1 ∪ B2) is 4K**C**-large. This means that there is a 2K**C**-large set S ⊆ B \ (B1 ∪ B2) with origins entirely to the left of I2, or entirely to the right of I1. Suppose, w.l.o.g., that the former case holds, and decompose S as a union of maximal output blocks B′1, B′2, ..., B′n with origins either entirely inside I1, or entirely outside. Since S ∩ B1 = ∅, every block B′i with origins inside I1 is **C**-small. Similarly, one can prove that every block B′i with origins outside I1 is **C**-small too. Moreover, since ρ is K-visiting, we get n ≤ 2K. Altogether, this contradicts the assumption that S is 2K**C**-large.

**Lemma 7.** *Let* I = I1 · I2 ··· In*, such that* I *is a loop and flow*(I) = *flow*(Ik) *for all* k*. Then bigout*(I) *can be decomposed as* B1 · J1 · B2 · J2 · ... · Jn−1 · Bn*, where* Bk = *bigout*(Ik) *for all* k = 1, ..., n*, and every* Jk *is a* 2K**C***-small output interval with origins inside* Ik ∪ Ik+1*.*


*Proof (sketch).* The proof idea is similar to the previous lemma. First, using properties of idempotent flows, one shows that all output positions strictly between Bk and Bk+1, for any k = 1, ..., n−1, have origin in Ik ∪ Ik+1. Then, one observes that every output block of Ik disjoint from Bk is **C**-small, and since T is K-visiting there are at most K such blocks. This shows that every output interval Jk between Bk and Bk+1 is 2K**C**-small. For an illustration see the figure to the right. The **C**-large blocks in I1 are shown in red; in blue those for I2, in purple those for I3. So *bigout*(I1) is the entire output between the two red dots, *bigout*(I2) between the two blue dots, and *bigout*(I3) between the purple dots. All three blocks are non-empty, and *bigout*(I1 · I2 · I3) goes from the first red to the second purple dot. Black non-dashed arrows stand for **C**-small blocks.

#### **5 Proof of Theorem 1**

This section is devoted to proving the characterization of one-way resynchronizability in the bounded-visit case. We will use the notion of *bounded traversal* from [21], which was shown to characterize the class of bounded regular resynchronizers, much as bounded delay characterizes rational resynchronizers [15].

**Definition 4 (traversal [21]).** *Let* σ = (u, v, *orig*) *and* σ′ = (u, v, *orig*′) *be two synchronized pairs with the same input and output words.*

*Given two input positions* y, y′ ∈ dom(u)*, we say that* y′ traverses y *if there is a pair* (y′, z) *of source and target origins associated with the same output position such that* y *is between* y′ *and* z*, with* y ≠ z *and possibly* y′ = y*. More precisely:*


*A pair* (σ, σ′) *of synchronized pairs with input* u *and output* v *is said to have* k-bounded traversal*, with* k ∈ N*, if every* y ∈ dom(u) *is traversed by at most* k *distinct positions of* dom(u)*.*

*A resynchronizer* R *has* bounded traversal *if there is some* k ∈ N *such that every* (σ, σ′) ∈ R *has* k*-bounded traversal.*

**Lemma 8 ([21]).** *A regular resynchronizer is bounded if and only if it has bounded traversal.*

*Proof (of Theorem 1).* First of all, observe that the implication 4 → 1 is straightforward. To prove the implication 1 → 2, assume that there is a k-bounded, regular resynchronizer R that is T-preserving and such that R(T) is order-preserving. Lemma 8 implies that R has t-bounded traversal, for some constant t. We head towards proving that T has cross-width bounded by t + k. Consider two synchronized pairs σ = (u, v, *orig*) and σ′ = (u, v, *orig*′) such that σ ∈ [[T]]o and (σ, σ′) ∈ R, and consider a cross (X1, X2) of σ. We claim that |*orig*(X1)| or |*orig*(X2)| is at most t + k. Let x1 = min(*orig*(X1)), x′1 = max(*orig*′(X1)), x2 = max(*orig*(X2)), and x′2 = min(*orig*′(X2)). Since (X1, X2) is a cross, we have x1 > x2, and since σ′ is order-preserving, we have x′1 ≤ x′2. Now, if x′1 > x2, then at least |*orig*(X2)| − k input positions from *orig*(X2) traverse x′1 to the right (the −k term is due to the fact that at most k input positions can be resynchronized to x′1). Symmetrically, if x′1 ≤ x2, then at least |*orig*(X1)| − k input positions from *orig*(X1) traverse x2 to the left (the −k term accounts for the case where some positions are resynchronized to x′1 and x′1 = x2). This implies min(|*orig*(X1)|, |*orig*(X2)|) ≤ t + k, as claimed.

The remaining implications rely on the assumption that T is bounded-visit.

The implication 2 → 3 is shown by contraposition: one considers a successful run ρ with an inversion, and shows that crosses of arbitrary width emerge after pumping the loops of the inversion (here Lemma 3 is crucial).

The proof of 3 → 4 is more involved; we only sketch it here. Assuming that no successful run of T has inversions, we build a partially bijective, regular resynchronizer R that is T-preserving and such that R(T) is order-preserving. The resynchronizer R uses some parameters to guess a successful run ρ of T on u and a factorization tree of bounded height for ρ. Formally, a *factorization tree* for a sequence α of monoid elements (e.g. the flows *flow*ρ([y, y]) for all input positions y) is an ordered, unranked tree whose yield is the sequence α. The leaves of the factorization tree are labeled with the elements of α. All other nodes have at least two children and are labeled by the monoid product of the labels of their children (in our case, by the flows of ρ induced by the covered factors of the input). In addition, if a node has more than two children, then all its children must have the same label, representing an idempotent element of the monoid. By Simon's factorization theorem [23], every sequence of monoid elements has some factorization tree of height at most linear in the size of the monoid (in our case, at most 3|M_T|, see e.g. [8]).
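
For concreteness, here is a small Haskell sketch (names ours, not from the paper) of the shape constraints on factorization trees over an abstract monoid: binary nodes are unconstrained, while nodes with more than two children must have all children carrying the same idempotent label.

```haskell
-- Sketch: factorization trees over an abstract monoid; leaves carry single
-- elements, internal nodes the product of their children.
data FactTree m = Leaf m | Node m [FactTree m]
  deriving Show

labelOf :: FactTree m -> m
labelOf (Leaf m)   = m
labelOf (Node m _) = m

-- Smart constructor enforcing the constraints on internal nodes, given the
-- monoid product 'prod'.
node :: Eq m => (m -> m -> m) -> [FactTree m] -> Maybe (FactTree m)
node prod ts
  | length ts == 2                            = Just (Node lbl ts)
  | length ts > 2 && sameLabel && idempotent  = Just (Node lbl ts)
  | otherwise                                 = Nothing
  where
    ls         = map labelOf ts
    lbl        = foldl1 prod ls
    sameLabel  = all (== head ls) ls
    idempotent = prod (head ls) (head ls) == head ls
```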

*Parameters.* We use input parameters to encode the successful run ρ and a factorization tree for ρ of height at most H = 3|M_T|. These parameters specify, for each input interval corresponding to a subtree, the start and end positions of the interval and the label of the root of the subtree. Correctness of these annotations can be enforced by an MSO sentence ipar. The run and the factorization tree also need to be encoded over the output, using output parameters. More precisely, given a level in the tree and an output position, we need to be able to determine the flow and the productive edge that generated that position. We omit the technical details for checking correctness of the output annotation using the formulas opar, move_τ and next_{τ,τ′}.

*Moving origins.* For each level ℓ of the factorization tree, a partial resynchronization relation R_ℓ is defined. The relation is partial in the sense that some output positions may not have a source-target origin pair defined at a given level. But once a source-target pair is defined for some output position at a given level, it remains defined for all higher levels.

In the following we write *bigout*(p) for the dominant output interval associated with the input interval I(p) corresponding to a node p in the tree. For every level ℓ of the factorization tree, the resynchronizer R_ℓ will be a partial function from source origins to target origins, and will satisfy the following:


The construction of R_ℓ is by induction on ℓ. For a binary node p at level ℓ with children p1, p2, the resynchronizer R_ℓ inherits the source-origin pairs from level ℓ − 1 for output positions that belong to *bigout*(p1) ∪ *bigout*(p2). Note that *bigout*(p1) < *bigout*(p2) by Lemma 5, so R_ℓ is order-preserving inside *bigout*(p1) ∪ *bigout*(p2). Output positions inside *bigout*(p) \ (*bigout*(p1) ∪ *bigout*(p2)) are moved in an order-preserving manner to one of the extremities of I(p), or to the last position of I(p1). Boundedness of R_ℓ is guaranteed by Lemma 6.

The case where p is an idempotent node at level ℓ with children p1, p2, ..., pn follows a similar approach. For brevity, let Ii = I(pi) and Bi = *bigout*(pi), and observe that, by Lemma 5, B1 < B2 < ··· < Bn. Lemma 7 provides a decomposition of *bigout*(p) as B1 · J1 · B2 · J2 · ... · Jn−1 · Bn, for some 2K**C**-small output intervals Jk with origins inside Ik ∪ Ik+1, for k = 1, ..., n − 1. As before, the resynchronizer R_ℓ behaves exactly as R_{ℓ−1} for the output positions inside the Bk's. For any other output position, say x ∈ Jk, the resynchronizer R_ℓ will move the origin either to the last position of Ik or to the first position of Ik+1, depending on whether the source origin of x belongs to Ik or Ik+1.

#### **6 Proof overview of Theorem 2**

The main obstacle towards dropping the bounded-visit restriction from Theorem 1, while maintaining the effectiveness of the characterization, is the lack of a bound on the number of flows. Indeed, for a transducer T that is not necessarily bounded-visit, there is no bound on the number of flows that encode successful runs of T, and thus the proofs of the implications 2 → 3 → 4 are not applicable anymore. However, the proofs of the implications 1 → 2 and 4 → 1 remain valid, even for a transducer T that is not bounded-visit.

The idea for proving Theorem 2 is to transform T into an equivalent bounded-visit transducer *low*(T), so that the property of one-way resynchronizability is preserved. More precisely, given a two-way transducer T, we construct:

- a bounded-visit transducer *low*(T) (with regular outputs and common guess, see below) that is classically equivalent to T, and
- a partially bijective, regular resynchronizer R that is T-preserving and such that R(T) =o *low*(T).

We can apply our characterization of one-way resynchronizability in the bounded-visit case to the transducer *low*(T). If *low*(T) is one-way resynchronizable, then by Theorem 1 we obtain another partially bijective, regular resynchronizer R′ that is *low*(T)-preserving and such that R′(*low*(T)) is order-preserving. Thanks to Lemma 2, the resynchronizers R and R′ can be composed, so we conclude that the original transducer T is one-way resynchronizable. Otherwise, if *low*(T) is not one-way resynchronizable, we show that neither is T. This is precisely shown in the lemma below.

**Lemma 9.** *For all transducers* T, T′*, with* T′ *bounded-visit, and for every partially bijective, regular resynchronizer* R *that is* T*-preserving and such that* R(T) =o T′*,* T *is one-way resynchronizable if and only if* T′ *is one-way resynchronizable.*

There are however some challenges in the approach described above. First, as T may output arbitrarily many symbols with origin in the same input position, and *low*(T) is bounded-visit, we need *low*(T) to be able to produce arbitrarily long outputs within a single transition. For this reason, we allow *low*(T) to be a transducer with *regular outputs*. The transition relation of such a transducer consists of finitely many tuples of the form (q, a, L, q′), with q, q′ ∈ Q, a ∈ Σ, and L ⊆ Γ∗ a regular language over the output alphabet. The semantics of a transition rule (q, a, L, q′) is that, upon reading a, the transducer can switch from state q to state q′, and move its head accordingly, while outputting any word from L. We also need to use transducers with common guess. Both extensions, regular outputs and common guess, already appeared in prior works (cf. [5,7]), and the proof of Theorem 1 in the bounded-visit case can be easily adapted to these features.

There is still another problem: we cannot always expect that there exists a bounded-visit transducer *low*(T) classically equivalent to T. Consider, for instance, the transducer that performs several passes on the input, and on each left-to-right pass, at an arbitrary input position, it copies as output the letter under its head. It is easy to see that the Parikh image of the output is an exact multiple of the Parikh image of the input, and standard pumping arguments show that no bounded-visit transducer can realize such a relation.

A solution to this second problem is as follows. Before trying to construct *low*(T), we test whether T satisfies the following condition on vertical loops (these are runs starting and ending at the same position and at the same state). There should exist some K such that T is K*-sparse*, meaning that the number of different origins of outputs generated inside some vertical loop is at most K. If this condition is not met, then we show that T has unbounded cross-width, and hence, by the implication 1 → 2 of Theorem 1, T is not one-way resynchronizable. Otherwise, if the condition holds, then we show that a bounded-visit transducer *low*(T) equivalent to T can indeed be constructed.

### **7 Complexity**

We discuss the effectiveness and complexity of our characterization. For a k-visiting transducer T, the effectiveness of the characterization relies on detecting inversions in successful runs of T. It is not difficult to see that this can be decided in space that is polynomial in the size of T and the bound k. We can also show that one-way resynchronizability is Pspace-hard. For this we recall that the emptiness problem for two-way finite automata is Pspace-complete. Let A be a two-way automaton accepting some language L, and let Σ be a binary alphabet disjoint from that of L. The function {(w · a1 ... an, an ... a1) | w ∈ L, a1 ... an ∈ Σ∗, n ≥ 0} can be realized by a two-way transducer T of size polynomial in |A|, and T is one-way resynchronizable if and only if L is empty.

In the unrestricted case, we showed that one-way resynchronizability is decidable (Theorem 2). We briefly outline the complexity of the decision procedure:


Summing up, one can decide one-way resynchronizability of unrestricted two-way transducers in exponential space. It is open whether this bound is optimal. We also do not have any interesting bound on the size of the resynchronizer that witnesses one-way resynchronizability, either in the bounded-visit case or in the unrestricted case. Similarly, we lack upper and lower bounds on the size of the resynchronized one-way transducers, when these exist.

#### **8 Conclusions**

As the main contribution of this paper, we provided a characterization of the subclass of two-way transducers that are one-way resynchronizable, namely, those that can be transformed, by some bounded, regular resynchronizer, into an origin-equivalent one-way transducer.

There are similar definability problems that emerge in the origin semantics. For instance, one could ask whether a given two-way transducer can be resynchronized, through some bounded, regular resynchronization, to a relation that is origin-equivalent to a first-order transduction. This can be seen as a relaxation of the first-order definability problem in the origin semantics, namely, the problem of telling whether a two-way transducer is origin-equivalent to some first-order transduction, shown decidable in [4]. It is worth contrasting the latter problem with the challenging open problem whether a given transduction is equivalent to a first-order transduction in the classical setting.

*Acknowledgments.* We thank the FoSSaCS reviewers for their constructive and useful comments.

#### **References**

1. Rajeev Alur and Pavel Černý. Expressiveness of streaming string transducers. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS'10), volume 8 of LIPIcs, pages 1–12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2010.



#### **Fair Refinement for Asynchronous Session Types**⋆

Mario Bravetti1, Julien Lange2(✉), and Gianluigi Zavattaro1

<sup>1</sup> University of Bologna / INRIA FoCUS Team, Bologna, Italy {mario.bravetti,gianluigi.zavattaro}@unibo.it <sup>2</sup> Royal Holloway, University of London, Egham, UK julien.lange@rhul.ac.uk

**Abstract.** Session types are widely used as abstractions of asynchronous message passing systems. Refinement for such abstractions is crucial as it allows improvements of a given component without compromising its compatibility with the rest of the system. In the context of session types, the most general notion of refinement is the asynchronous session subtyping, which allows to anticipate message emissions but only under certain conditions. In particular, asynchronous session subtyping rules out candidate subtypes that occur naturally in communication protocols where, e.g., two parties simultaneously send each other a finite but unspecified amount of messages before removing them from their respective buffers. To address this shortcoming, we study fair compliance over asynchronous session types and fair refinement as the relation that preserves it. This allows us to propose a novel variant of session subtyping that leverages the notion of controllability from service contract theory and that is a sound characterisation of fair refinement. In addition, we show that both fair refinement and our novel subtyping are undecidable. We also present a sound algorithm, and its implementation, which deals with examples that feature potentially unbounded buffering.

**Keywords:** Session types · Asynchronous communication · Subtyping.

### **1 Introduction**

The coordination of software components via message-passing techniques is becoming increasingly popular in modern programming languages and development methodologies based on actors and microservices, e.g., Rust, Go, and the Twelve-Factor App methodology [1]. Often the communication between two concurrent or distributed components takes place over point-to-point fifo channels.

Abstract models such as communicating finite-state machines [5] and asynchronous session types [21] are essential to reason about the correctness of such systems in a rigorous way. In particular these models are important to reason about mathematically grounded techniques to improve concurrent and distributed systems in a compositional way. The key question is whether a component can be *refined* independently of the others, without compromising the

⋆ Research partly supported by the H2020-MSCA-RISE project ID 778233 "Behavioural Application Program Interfaces (BEHAPI)".

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 144–163, 2021.

https://doi.org/10.1007/978-3-030-71995-1_8

correctness of the whole system. In the theory of session types, the most general notion of refinement is the asynchronous session subtyping [14, 15, 26], which leverages asynchrony by allowing the refined component to anticipate message emissions, but only under certain conditions. Notably asynchronous session subtyping rules out candidate subtypes that occur naturally in communication protocols where, e.g., two parties simultaneously send each other a finite but unspecified amount of messages before removing them from their buffers.

We illustrate this key limitation of asynchronous session subtyping with Figure 1, which depicts possible communication protocols between a spacecraft and a ground station. For convenience, the protocols are represented as session types (bottom) and equivalent communicating finite-state machines (top). Consider TS and TG first. Session type TS is the abstraction of the spacecraft. It may send a finite but unspecified number of telemetries (*tm*), followed by a message *over* — this phase of the protocol typically models a for loop and its exit. In the second phase, the spacecraft receives a number of telecommands (*tc*), followed by a message *done*. Session type TG is the abstraction of the ground station. It is the *dual* of TS, written $\overline{T_S}$, as required in standard binary session types without subtyping. Since TG and TS are dual of each other, the theory of session types guarantees that they form a *correct composition*, namely both parties terminate successfully, with empty queues.

However, it is clear that this protocol is not efficient: the communication is half-duplex, i.e., it is never the case that more than one party is sending at any given time. Using full-duplex communication is crucial in distributed systems with intermittent connectivity, e.g., in this case ground stations are not always visible from low orbit satellites.

The abstraction of a more efficient ground station is given by type T′G, which sends telecommands before receiving telemetries. It is clear that T′G and TS form a correct composition. Unfortunately T′G is not an asynchronous subtype of TG according to earlier definitions of session subtyping [14,15,26]. Hence they cannot formally guarantee that T′G is a safe replacement for TG. Concretely, these subtyping relations allow for anticipation of emissions (output) only when they are preceded by a *bounded* number of receptions (input), but this does not hold between T′G and TG because the latter starts with a loop of inputs. Note that the composition of T′G and TS is not existentially bounded, hence it cannot be verified by related communicating finite-state machines techniques [4,19,20,24].

In this paper we address this limitation of previous asynchronous session subtyping relations. To do this, we move to an alternative notion of correct composition. In [14] the authors show that their subtyping relation is fully abstract w.r.t. the notion of *orphan-message-free* composition. More precisely, it captures exactly a notion of refinement that preserves the possibility for all sent messages to be consumed along *all* possible computations of the receiver. In the spacecraft example, given the initial loop of outputs in T <sup>G</sup>, there is an extreme case in which it performs infinitely many outputs without consuming any incoming messages. Nevertheless, this limit case cannot occur under the natural assumption that

**Fig. 1.** Satellite protocols. T′G is the refined session type of the ground station, TG is the session type of the ground station, and TS is the session type of the spacecraft.

the loop of outputs eventually terminates, i.e., only a finite (but unspecified) amount of messages can be emitted.

The notion of correct composition that we use is based on *fair* compliance, which requires each component to always be able to eventually reach a successful final state. This is a liveness property, holding under *full fairness* [32], used also in the theory of should testing [30] where "every reachable state is required to be on a path to success". This is a natural constraint since even programs that conceptually run indefinitely must account for graceful termination (e.g., to release acquired resources). Previously, fair compliance has been considered to reason formally about component/service composition with *synchronous* session types [29] and *synchronous* behavioural contracts [11]. A preliminary formalisation of fair compliance for *asynchronous* behavioural contracts was presented in [10], but considering an operational model very different from session types.

Given a notion of fair compliance defined on an operational model for asynchronous session types, we define *fair refinement* as the relation that preserves it. Then, we propose a novel variant of session subtyping called *fair asynchronous session subtyping*, that leverages the notion of controllability from service contract theory, and which is a sound characterisation of fair refinement. We show that both fair refinement and fair asynchronous session subtyping are undecidable, but give a sound algorithm for the latter. Our algorithm covers session types that exhibit complex behaviours (including the spacecraft example and variants). Our algorithm has been implemented in a tool available online [31].

*Structure of the paper* The rest of this paper is structured as follows. In § 2 we recall syntax and semantics of asynchronous session types, we define *fair compliance* and the corresponding *fair refinement*. In § 3 we introduce *fair asynchronous subtyping*, the first relation of its kind to deal with examples such as those in Figure 1. In § 4 we propose a sound algorithm for subtyping that supports examples with unbounded accumulations, including the ones discussed in this paper. In § 5 we discuss the implementation of this algorithm. Finally, in § 6 we discuss related works and future work. We give proofs for all our results and examples of output from our tool in [9].

#### **2 Refinement for Asynchronous Session Types**

In this section we first recall the syntax of two-party session types, their reduction semantics, and a notion of compliance centred on the successful termination of interactions. We define our notion of refinement based on this compliance and show that it is generally undecidable whether a type is a refinement of another.

#### **2.1 Preliminaries: Asynchronous Session Types**

*Syntax* The formal syntax of two-party session types is given below. We follow the simplified notation used in, e.g., [7,8], without dedicated constructs for sending an output/receiving an input. Additionally we abstract away from message payloads since they are orthogonal to the results of this paper.

**Definition 1 (Session Types).** *Given a set of labels* L*, ranged over by* l*, the syntax of two-party session types is given by the following grammar:*

T ::= ⊕{li : Ti}i∈I | &{li : Ti}i∈I | μ**t**.T | **t** | **end**

Output selection ⊕{li : Ti}i∈I represents a guarded internal choice, specifying that a label li is sent over a channel, then continuation Ti is executed. Input branching &{li : Ti}i∈I represents a guarded external choice, specifying a protocol that waits for messages. If message li is received, continuation Ti takes place. In selections and branchings each branch is tagged by a label li, taken from a global set of labels L. In each selection/branching, these labels are assumed to be pairwise distinct. In the sequel, we leave implicit the index set i ∈ I in input branchings and output selections when it is clear from the context. Types μ**t**.T and **t** denote standard recursion constructs. We assume recursion to be guarded in session types, i.e., in μ**t**.T, the recursion variable **t** occurs within the scope of a selection or branching. Session types are closed, i.e., all recursion variables **t** occur under the scope of a corresponding binder μ**t**.T. Terms of the session syntax that are not closed are dubbed (session) terms. Type **end** denotes the end of the interactions.
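
As a running illustration for this and the following subsections, the grammar of Definition 1 can be transcribed as a Haskell datatype. This is only a sketch, and the constructor names are ours, not from the paper.

```haskell
-- Session types of Definition 1 as a Haskell datatype (a sketch; names ours).
type Label = String
type Var   = String

data SessionType
  = Select [(Label, SessionType)]   -- ⊕{l_i : T_i}, output selection
  | Branch [(Label, SessionType)]   -- &{l_i : T_i}, input branching
  | Rec Var SessionType             -- μt.T
  | TVar Var                        -- recursion variable t
  | End                             -- end
  deriving (Eq, Show)
```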

The dual of session type T, written $\overline{T}$, is inductively defined as follows: $\overline{\oplus\{l_i : T_i\}_{i\in I}} = \&\{l_i : \overline{T_i}\}_{i\in I}$, $\overline{\&\{l_i : T_i\}_{i\in I}} = \oplus\{l_i : \overline{T_i}\}_{i\in I}$, $\overline{\mathbf{end}} = \mathbf{end}$, $\overline{\mathbf{t}} = \mathbf{t}$, and $\overline{\mu\mathbf{t}.T} = \mu\mathbf{t}.\overline{T}$.
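
Duality is then the evident structural map; a sketch, reusing the SessionType datatype introduced above.

```haskell
-- Dual of a session type: swap selections and branchings, leave the rest.
dual :: SessionType -> SessionType
dual (Select bs) = Branch [ (l, dual t) | (l, t) <- bs ]
dual (Branch bs) = Select [ (l, dual t) | (l, t) <- bs ]
dual (Rec v t)   = Rec v (dual t)
dual (TVar v)    = TVar v
dual End         = End
```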

*Operational characterisation* Hereafter, we let ω range over words in L∗, write ε for the empty word, and write ω1 · ω2 for the concatenation of words ω1 and ω2, where each word may contain zero or more labels. Also, we write T{T′/**t**} for T where every free occurrence of **t** is replaced by T′.

We give an asynchronous semantics of session types via transition systems whose states are configurations of the form [T1, ω1] | [T2, ω2], where T1 and T2 are session types equipped with two sequences ω1 and ω2 of incoming messages (representing unbounded buffers). We use s, s′, etc. to range over configurations.

In this paper, we use explicit unfoldings of session types, as defined below.

**Definition 2 (Unfolding).** *Given session type* T*, we define* unfold(T)*:*

$$\mathsf{unfold}(T) = \begin{cases} \mathsf{unfold}(T'\{\mu\mathbf{t}.T'/\mathbf{t}\}) & \text{if } T = \mu\mathbf{t}.T'\\ T & \text{otherwise} \end{cases}$$

Definition 2 is standard, e.g., an equivalent function is used in the first session subtyping [18]. Notice that unfold(T) unfolds all the recursive definitions in front of T, and it is well defined for session types with guarded recursion.
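
Definition 2 can be transcribed directly. The substitution below is naive, which is adequate here because only closed types are ever substituted for recursion variables; guarded recursion ensures that unfold terminates. A sketch, reusing the SessionType datatype above.

```haskell
-- Substitution T{S/t}: replace free occurrences of variable v by the closed type s.
subst :: Var -> SessionType -> SessionType -> SessionType
subst v s (Select bs) = Select [ (l, subst v s t) | (l, t) <- bs ]
subst v s (Branch bs) = Branch [ (l, subst v s t) | (l, t) <- bs ]
subst v s (Rec w t)
  | w == v    = Rec w t                 -- v is rebound here: stop
  | otherwise = Rec w (subst v s t)
subst v s (TVar w)
  | w == v    = s
  | otherwise = TVar w
subst _ _ End = End

-- unfold of Definition 2: strip all the recursive definitions in front of T.
unfold :: SessionType -> SessionType
unfold (Rec v t) = unfold (subst v (Rec v t) t)
unfold t         = t
```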

**Definition 3 (Transition Relation).** *The transition relation* → *over configurations is the minimal relation satisfying the rules below (plus symmetric ones):*

*1. if* j ∈ I *then* [⊕{li : Ti}i∈I, ω1] | [T2, ω2] → [Tj, ω1] | [T2, ω2 · lj]*;*
*2. if* j ∈ I *then* [&{li : Ti}i∈I, lj · ω1] | [T2, ω2] → [Tj, ω1] | [T2, ω2]*;*
*3. if* [unfold(T1), ω1] | [T2, ω2] → s *then* [T1, ω1] | [T2, ω2] → s*.*

*We write* →<sup>∗</sup> *for the reflexive and transitive closure of the* → *relation.*

Intuitively, a configuration s reduces to a configuration s′ when either (1) a type outputs a message lj, which is added at the end of its partner's queue; (2) a type consumes an expected message lj from the head of its own queue; or (3) the unfolding of a type can execute one of the transitions above.
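
The one-step successors of a configuration can then be enumerated as follows. This is a sketch; `Endpoint`, `Config` and `step` are our names, and the symmetric rules are obtained by swapping the two endpoints. It reuses the SessionType datatype and unfold from the sketches above.

```haskell
-- One endpoint [T, ω]: a session type together with its queue of incoming
-- labels; a configuration pairs two endpoints.
type Endpoint = (SessionType, [Label])
type Config   = (Endpoint, Endpoint)

-- One-step successors of a configuration, following rules 1-3 of
-- Definition 3 for the left endpoint, plus the symmetric rules.
step :: Config -> [Config]
step (e1, e2) =
     half e1 e2
  ++ [ (c1, c2) | (c2, c1) <- half e2 e1 ]    -- symmetric rules (right moves)
  where
    -- 'half mover other' returns the (mover', other') pairs for moves of 'mover'.
    half (t, w) (t', w') = case unfold t of   -- rule 3: unfold first
      Select bs -> [ ((ti, w), (t', w' ++ [li])) | (li, ti) <- bs ]     -- rule 1
      Branch bs -> case w of                                            -- rule 2
        l : rest -> [ ((ti, rest), (t', w')) | (li, ti) <- bs, li == l ]
        []       -> []
      _ -> []
```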

Next, we define successful configurations as those configurations where both types have terminated (reaching **end**) and both queues are empty. We use this to give our definition of compliance which holds when it is possible to reach a successful configuration from all reachable configurations.

**Definition 4 (Successful Configuration).** *The notion of successful configuration is formalised by a predicate* s √ *defined as follows:*

[T, ωT] | [S, ωS] √ *iff* unfold(T) = unfold(S) = **end** *and* ωT = ωS = ε

**Definition 5 (Compliance).** *Given a configuration* s*, we say that it is a correct composition if, whenever* s →∗ s′*, there exists a configuration* s′′ *such that* s′ →∗ s′′ *and* s′′√*.*

*Two session types* T *and* S *are* compliant *if* [T, ε] | [S, ε] *is a correct composition.*

Observe that our definition of compliance is stronger than what is generally considered in the literature on session types, e.g., [16, 23, 24], where two types are deemed compliant if all messages that are sent are eventually received, and each non-terminated type can always eventually make a move. Compliance is analogous to the notion of *correct session* in [29] but in an asynchronous setting.
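
Compliance quantifies over an unbounded set of reachable configurations, so it cannot be checked by exhaustive search. The following sketch (names ours) only explores configurations whose queues stay within a given bound and checks the compliance condition on that finite truncation; it is neither a sound nor a complete test in general, just a useful experiment. It reuses `step`, `unfold` and `Config` from the sketches above.

```haskell
-- Successful configuration (Definition 4): both types have terminated and
-- both queues are empty.
successful :: Config -> Bool
successful ((t1, w1), (t2, w2)) =
  unfold t1 == End && unfold t2 == End && null w1 && null w2

-- Configurations reachable from 'start' when queues are truncated at 'bound'
-- (a finite approximation of the real, unbounded state space).
reachableUpTo :: Int -> Config -> [Config]
reachableUpTo bound start = go [start] []
  where
    small ((_, w1), (_, w2)) = length w1 <= bound && length w2 <= bound
    go []         visited = visited
    go (c : todo) visited
      | c `elem` visited = go todo visited
      | otherwise        = go (filter small (step c) ++ todo) (c : visited)

-- Compliance (Definition 5) checked on the truncated system only: from every
-- explored configuration a successful configuration must remain reachable.
compliantUpTo :: Int -> SessionType -> SessionType -> Bool
compliantUpTo bound t s = all canSucceed (reachableUpTo bound start)
  where
    start        = ((t, []), (s, []))
    canSucceed c = any successful (reachableUpTo bound c)
```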

A consequence of Definition 5 is that it is generally *not* the case that a session type T is compliant with its dual $\overline{T}$, as we show in the example below.

*Example 1.* The session type T = &{l1 : **end**, l2 : μ**t**.⊕{l3 : **t**}} and its dual $\overline{T}$ = ⊕{l1 : **end**, l2 : μ**t**.&{l3 : **t**}} are not compliant. Indeed, when $\overline{T}$ sends label l2, the configuration [**end**, ε] | [**end**, ε] is no longer reachable.

#### **2.2 Fair Refinement for Asynchronous Session Types**

We introduce a notion of refinement that preserves compliance. This follows previous work done in the context of behavioural contracts [11] and *synchronous* multi-party session types [29]. The key difference with these works is that we are considering asynchronous communication based on (unbounded) fifo queues. Asynchrony makes fair refinement undecidable, as we show below.

**Definition 6 (Refinement).** *A session type* T *refines* S*, written* T ⊑ S*, if, for every* S′ *such that* S *and* S′ *are compliant,* T *and* S′ *are also compliant.*

In contrast to traditional (synchronous and asynchronous) subtyping for session types [14, 18, 26], this refinement is not covariant on outputs, i.e., it does not always allow a refined type to have output selections with fewer labels.<sup>3</sup>

*Example 2.* Let T = μ**t**.⊕{l1 : **t**} and S = μ**t**.⊕{l1 : **t**, l2 : **end**}. We have that T is a synchronous (and asynchronous) subtype of S. However T is *not* a refinement of S. In particular, the type S′ = μ**t**.&{l1 : **t**, l2 : **end**} is compliant with S but not with T, since T does not terminate.

Next, we show that the refinement relation is generally undecidable. The proof of undecidability exploits results from the tradition of computability theory, i.e., Turing completeness of queue machines. The crux of the proof is to reduce the problem of checking the reachability of a given state in a queue machine to the problem of checking the refinement between two session types.

*Preliminaries* Below we consider only state reachability in queue machines, and not the typical notion of the language recognised by a queue machine (see, e.g., [7] for a formalisation of queue machines). Hence, we use a simplified formalisation, where no input string is considered.

**Definition 7 (Queue Machine).** *A queue machine* M *is defined by a six-tuple* (Q, Σ, Γ, \$, s, δ) *where:*

- Q *is a finite set of states;*
- Σ ⊂ Γ *is a finite input alphabet;*
- Γ *is a finite queue alphabet;*
- \$ ∈ Γ \ Σ *is the initial queue symbol;*
- s ∈ Q *is the start state;*
- δ : Q × Γ → Q × Γ∗ *is the transition function.*

Considering a queue machine M = (Q, Σ, Γ, \$, s, δ), a *configuration* of M is an ordered pair (q, γ) where q ∈ Q is its *current state* and γ ∈ Γ∗ is the *queue*. The starting configuration is (s, \$), composed of the start state s and the initial queue symbol \$.

Next, we define the transition relation (→<sup>M</sup>), leading a configuration to another, and the related notion of state reachability.

<sup>3</sup> The synchronous subtyping in [18] follows a channel-oriented approach; hence it has the opposite direction and is contravariant on outputs.

**Definition 8 (State Reachability).** *Given a machine* M = (Q, Σ, Γ, \$, s, δ)*, the transition relation* →M *over configurations* Q × Γ∗ *is defined as follows. For* p, q ∈ Q*,* A ∈ Γ*, and* α, γ ∈ Γ∗*, we have* (p, Aα) →M (q, αγ) *whenever* δ(p, A) = (q, γ)*. Let* →∗M *be the reflexive and transitive closure of* →M*.*

*A target state* qf ∈ Q *is* reachable *in* M *if there is* γ ∈ Γ∗ *s.t.* (s, \$) →∗M (qf, γ)*.*
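
State reachability can be semi-decided by exploring configurations forward from (s, \$). The Haskell sketch below (names ours) returns True whenever qf is reachable and may diverge otherwise, since the set of reachable configurations can be infinite.

```haskell
import qualified Data.Set as Set

type State = String
type Sym   = Char
type QConf = (State, [Sym])          -- a configuration (q, γ)

-- δ : Q × Γ → Q × Γ*, given as a function of the state and the queue head.
type Delta = State -> Sym -> (State, [Sym])

-- Forward exploration from the starting configuration (s, "$"): a
-- semi-decision procedure for reachability of the target state qf.
reachable :: Delta -> State -> State -> Bool
reachable delta s qf = go Set.empty [(s, "$")]
  where
    go _ [] = False
    go seen (c@(q, _) : rest)
      | q == qf             = True
      | c `Set.member` seen = go seen rest
      | otherwise           = go (Set.insert c seen) (rest ++ next c)
    -- (p, Aα) →M (q, αγ) whenever δ(p, A) = (q, γ): dequeue A, enqueue γ.
    next (p, a : alpha) = let (q', gamma) = delta p a in [(q', alpha ++ gamma)]
    next (_, [])        = []
```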

Since queue machines can deterministically encode Turing machines (see, e.g., [7]), checking state reachability for queue machines is undecidable.

**Theorem 1.** *Given a queue machine* M *and a target state* q<sup>f</sup> *it is possible to reduce the problem of checking the reachability of* q<sup>f</sup> *in* M *to the problem of checking refinement between two session types.*

In the light of the undecidability of reachability in queue machines, we can conclude that refinement (Definition 6) is also undecidable.

#### **2.3 Controllability for Asynchronous Session Types**

Given a notion of compliance, controllability amounts to checking the existence of a compliant partner (see, e.g., [12, 25, 33]). In our setting, a session type is *controllable* if there exists another session type with which it is compliant.

Checking for controllability algorithmically is not trivial as it requires to consider infinitely many potential partners. For the synchronous case, an algorithmic characterisation was studied in [29]. In the asynchronous case, the problem is even harder because each of the infinitely many potential partners may generate an infinite state computation (due to unbounded buffers). The main contribution of this subsection is to give an algorithmic characterisation of controllability in the asynchronous setting. Doing this is important because controllability is an essential ingredient for defining fair asynchronous subtyping, see Section 3.

**Definition 9 (Characterisation of Controllability,** T ctrl**).** *Given a session type* T*, we define the judgement* T ok *inductively as follows:*

$$\frac{}{\mathbf{end}\ \mathsf{ok}} \qquad \frac{\mathbf{end}\in T \quad T\{\mathbf{end}/\mathbf{t}\}\ \mathsf{ok}}{\mu\mathbf{t}.T\ \mathsf{ok}} \qquad \frac{T\ \mathsf{ok}}{\&\{l:T\}\ \mathsf{ok}} \qquad \frac{\forall i \in I.\ T_i\ \mathsf{ok}}{\oplus\{l_i:T_i\}_{i\in I}\ \mathsf{ok}}$$

*where* **end** ∈ T *holds if* **end** *occurs in* T*.*

*We write* T ctrl *if there exists* T′ *such that (*i*)* T′ *is obtained from* T *by syntactically replacing every input prefix* &{li : Ti}i∈I *occurring in* T *with a term* &{lj : Tj} *(with* j ∈ I*) and (*ii*)* T′ ok *holds.*

Notice that a type T such that T ctrl is indeed controllable, in that $\overline{T'}$, the dual of the type T′ considered above, is compliant with T (the predicate **end** ∈ T in the premise of the rule for recursion guarantees that a successful configuration is always reachable while looping). Moreover the above definition naturally yields a simple algorithm that decides whether or not T ctrl holds for a type T, i.e., we first pick a single branch for each input prefix syntactically occurring in T (there are finitely many of them) and then we inductively check whether the resulting type T′ satisfies T′ ok.
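
The algorithm just described can be transcribed as follows, reusing the SessionType datatype and the subst function from the earlier sketches; `prune`, `hasEnd`, `ok` and `ctrl` are our names.

```haskell
-- All ways of replacing every input branching in T by a single branch.
prune :: SessionType -> [SessionType]
prune (Branch bs) = [ Branch [(l, t')] | (l, t) <- bs, t' <- prune t ]
prune (Select bs) = map Select (mapM pruneBranch bs)
  where pruneBranch (l, t) = [ (l, t') | t' <- prune t ]
prune (Rec v t)   = [ Rec v t' | t' <- prune t ]
prune (TVar v)    = [TVar v]
prune End         = [End]

-- end ∈ T : does end occur syntactically in T?
hasEnd :: SessionType -> Bool
hasEnd End         = True
hasEnd (TVar _)    = False
hasEnd (Rec _ t)   = hasEnd t
hasEnd (Select bs) = any (hasEnd . snd) bs
hasEnd (Branch bs) = any (hasEnd . snd) bs

-- The judgement T ok of Definition 9 (input prefixes must be singletons).
ok :: SessionType -> Bool
ok End               = True
ok (Rec v t)         = hasEnd t && ok (subst v End t)
ok (Branch [(_, t)]) = ok t
ok (Branch _)        = False
ok (Select bs)       = all (ok . snd) bs
ok (TVar _)          = False

-- T ctrl : some single-branch pruning of T satisfies ok.
ctrl :: SessionType -> Bool
ctrl t = any ok (prune t)
```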

The following theorem shows that the judgement T ctrl, as defined above, precisely characterises controllability (i.e., the existence of a compliant type).

**Theorem 2.** T ctrl *holds if and only if there exists a session type* S *such that* T *and* S *are compliant.*

*Example 3.* Consider the session type T = μ**t**.&{l1 : &{l2 : ⊕{l4 : **end**, l5 : μ**t**′.⊕{l6 : **t**′}}, l3 : **t**}}. T ctrl does *not* hold because it is not possible to construct a T′ as specified in Definition 9 for which T′ ok holds. By Theorem 2, there is no session type S that is compliant with T. Hence T is not controllable.

#### **3 Fair Asynchronous Session Subtyping**

In this section, we present our novel variant of asynchronous subtyping which we dub *fair asynchronous subtyping*.

We need to define a distinctive notion of unfolding. Function selUnfold(T) unfolds type T by replacing recursion variables with their corresponding definitions only if they are guarded by an output selection. In the definition, we use the predicate ⊕g(**t**, T) which holds if all instances of variable **t** are output-selection guarded, i.e., **t** occurs free in T only inside subterms ⊕{li : Ti}i∈I.

**Definition 10 (Selective Unfolding).** *Given a term* T*, define* selUnfold(T) =

$$\mathsf{selUnfold}(T) = \begin{cases} \oplus\{l_i : T_i\}_{i\in I} & \text{if } T = \oplus\{l_i : T_i\}_{i\in I}\\ \&\{l_i : \mathsf{selUnfold}(T_i)\}_{i\in I} & \text{if } T = \&\{l_i : T_i\}_{i\in I}\\ T'\{\mu\mathbf{t}.T'/\mathbf{t}\} & \text{if } T = \mu\mathbf{t}.T' \text{ and } \oplus g(\mathbf{t}, T')\\ \mu\mathbf{t}.\mathsf{selUnfold}(\mathsf{selRepl}(\mathbf{t},\hat{\mathbf{t}}, T')\{\mu\mathbf{t}.T'/\hat{\mathbf{t}}\}) \text{ with } \hat{\mathbf{t}} \text{ fresh} & \text{if } T = \mu\mathbf{t}.T' \text{ and } \neg\oplus g(\mathbf{t}, T')\\ \mathbf{t} & \text{if } T = \mathbf{t}\\ \mathbf{end} & \text{if } T = \mathbf{end} \end{cases}$$

*where* selRepl(**t**, **t̂**, T′) *is obtained from* T′ *by replacing the free occurrences of* **t** *that are inside a subterm* ⊕{li : Si}i∈I *of* T′ *by* **t̂***.*

*Example 4.* Consider the type T = μ**t**.&{l1 : **t**, l2 : ⊕{l3 : **t**}}, then we have

$$\mathsf{selUnfold}(T) = \mu\mathbf{t}.\&\{l_1:\mathbf{t},\ l_2:\oplus\{l_3:\mu\mathbf{t}.\&\{l_1:\mathbf{t},\ l_2:\oplus\{l_3:\mathbf{t}\}\}\}\}$$

i.e., the type is only unfolded within output selection sub-terms. Note that **t̂** is used to identify where unfolding must take place, e.g., selRepl(**t**, **t̂**, &{l1 : **t**, l2 : ⊕{l3 : **t**}}) = &{l1 : **t**, l2 : ⊕{l3 : **t̂**}}.

The last auxiliary notation required to define our notion of subtyping is that of *input contexts*, which are used to record inputs that may be delayed in a candidate super-type.

**Definition 11 (Input Context).** *An input context* A *is a session type with several holes defined by the syntax:*

A ::= [ ]^k | &{li : Ai}i∈I | μ**t**.A | **t**

*where the holes* [ ]^k*, with* k ∈ K*, of an input context* A *are assumed to be pairwise distinct. We assume that recursion is guarded, i.e., in an input context* μ**t**.A*, the recursion variable* **t** *must occur within a subterm* &{li : Ai}i∈I*.*

*We write holes*(A) *for the set of hole indices in* A*. Given a type* Tk *for each* k ∈ K*, we write* A[Tk]k∈K *for the type obtained by filling each hole* k *in* A *with the corresponding* Tk*.*
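
Input contexts and the filling operation can be represented analogously. A sketch, reusing Label, Var and SessionType from the earlier sketches; `InputCtx`, `holes` and `fill` are our names.

```haskell
-- Input contexts of Definition 11: session types with indexed holes.
data InputCtx
  = Hole Int                      -- [ ]^k
  | CBranch [(Label, InputCtx)]   -- &{l_i : A_i}
  | CRec Var InputCtx             -- μt.A
  | CVar Var                      -- t
  deriving (Eq, Show)

-- holes(A): the list of hole indices occurring in A.
holes :: InputCtx -> [Int]
holes (Hole k)     = [k]
holes (CBranch bs) = concatMap (holes . snd) bs
holes (CRec _ a)   = holes a
holes (CVar _)     = []

-- A[T_k]^{k∈K}: fill every hole with the type assigned to its index.
fill :: (Int -> SessionType) -> InputCtx -> SessionType
fill f (Hole k)     = f k
fill f (CBranch bs) = Branch [ (l, fill f a) | (l, a) <- bs ]
fill f (CRec v a)   = Rec v (fill f a)
fill _ (CVar v)     = TVar v
```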

In contrast to previous work [6,7,13–15,26], these input contexts may contain recursive constructs. This is crucial to deal with examples such as Figure 1.

We are now ready to define the *fair asynchronous subtyping* relation, written ≤. The rationale behind asynchronous session subtyping is that under asynchronous communication it is unobservable whether or not an output is anticipated before an input, as long as this output is executed along all branches of the candidate super-type. Besides the usage of our new recursive input contexts the definition of fair asynchronous subtyping differs from those in [6,7,13–15,26] in that controllability plays a fundamental role: the subtype is not required to mimic supertype inputs leading to uncontrollable behaviours.

### **Definition 12 (Fair Asynchronous Subtyping,** ≤**).**

*A relation* R *on session types is a controllable subtyping relation whenever* (T,S) ∈ R *implies:*


T *is a controllable subtype of* S *if there is a controllable subtyping relation* R *s.t.* (T, S) ∈ R*.*

T *is a* fair asynchronous subtype *of* S*, written* T ≤ S*, whenever:* S *controllable implies that* T *is a controllable subtype of* S*.*

Notice that the top-level check for controllability in the above definition is consistent with the inner controllability checks performed in Case (3).

*Subtyping simulation game* Session type T is a fair asynchronous subtype of S if S is not controllable or if T is a controllable subtype of S. Intuitively, the above co-inductive definition says that it is possible to play a simulation game between a subtype T and its supertype S as follows. Case (1) says that if T is the **end** type, then S must also be **end**. Case (2) says that if T is a recursive definition, then it simply unfolds this definition while S does not need to reply. Case (3) says that if T is an input branching, then the sub-terms in S that are controllable can reply by inputting at most some of the labels l<sup>i</sup> in the branching (contravariance of inputs), and the simulation game continues (see Example 5). Case (4) says that if T is an output selection, then S can reply by outputting *all* the labels l<sup>i</sup> in the selection, possibly after executing some inputs, after which the simulation game continues. We comment further on Case (4) with Example 6.

*Example 5.* Consider T = &{l1 : **end**, l2 : **end**} and S = &{l1 : **end**, l3 : μ**t**.⊕{l4 : **t**}}. We have T ≤ S. Once branch l3, which is uncontrollable, is removed from S, we can apply contravariance for input branching. We have I = {1, 2} ⊇ {1} = K in Definition 12.

*Example 6.* Consider TG and T′G from Figure 1. For the pair (T′G, TG), we apply Case (4) of Definition 12, for which we compute

$$\mathsf{selUnfold}(T_G) = \mathcal{A}[\oplus\{tc : \mu\mathbf{t}'.\oplus\{tc : \mathbf{t}', done : \mathbf{end}\}, done : \mathbf{end}\}]^1$$

with A = μ**t**.&{*tm* : **t**, *over* : []^1}. Observe that A contains a recursive sub-term; such contexts are not allowed in previous works [14, 15, 26].

The use of selective unfolding makes it possible to express TG in terms of a *recursive* input context A with holes filled by types (i.e., closed terms) that start with an output prefix. Indeed selective unfolding does not unfold the recursion variable **t** (*not* guarded by an output selection), which becomes part of the input context A. Instead it unfolds the recursion variable **t**′ (which is guarded by an output selection) so that the term that fills the hole, which is required to start with an output prefix, is a closed term.

Case (4) of Definition 12 requires us to check that the following pairs are in the relation: (i) (T′G, A[μ**t**′.⊕{*tc* : **t**′, *done* : **end**}]) and (ii) (μ**t**′.&{*tm* : **t**′, *over* : **end**}, A[**end**]). Observe that TG = A[μ**t**′.⊕{*tc* : **t**′, *done* : **end**}]. Hence, we have T′G ≤ TG with

R = {(T′G, TG), (**end**, **end**), (μ**t**′.&{*tm* : **t**′, *over* : **end**}, μ**t**.&{*tm* : **t**, *over* : **end**})}

and R is a controllable subtyping relation.

We show that fair asynchronous subtyping is sound w.r.t. fair refinement. In fact, fair asynchronous subtyping can be seen as a sound coinductive characterisation of fair refinement. Namely this result gives an operational justification to the syntactical definition of fair asynchronous session subtyping. Note that ≤ is not complete w.r.t. ⊑, see Example 7.

**Theorem 3.** *Given two session types* T *and* S*, if* T ≤ S *then* T ⊑ S*.*

*Example 7.* Let T = ⊕{l1 : &{l3 : **end**}} and S = &{l3 : ⊕{l1 : **end**, l2 : **end**}}. We have T ⊑ S, but T is not a fair asynchronous subtype of S since {l1} ≠ {l1, l2}, i.e., covariance of outputs is not allowed.

Unfortunately, fair asynchronous session subtyping is also undecidable. The proof is similar to the one of undecidability of fair refinement, in particular we proceed by reduction from the termination problem in queue machines.

**Theorem 4.** *Given two session types* T *and* S*, it is in general undecidable to check whether* <sup>T</sup> <sup>≤</sup> <sup>S</sup>*.*

#### **4 A Sound Algorithm for Fair Asynchronous Subtyping**

We propose an algorithm which soundly verifies whether a session type is a fair asynchronous subtype of another. The algorithm relies on building a tree whose nodes are labelled by configurations of the simulation game induced by Definition 12. The algorithm analyses the tree to identify *witness* subtrees which contain input contexts that are growing following a recognisable pattern.

*Example 8.* Recall the satellite communication example (Figure 1). The spacecraft with protocol TS may be a replacement for an older generation of spacecraft which follows the more complicated protocol T′S, see Figure 2. Type T′S notably allows the reception of telecommands to be interleaved with the emission of telemetries. The new spacecraft may safely replace the old one because TS ≤ T′S.

However, checking TS ≤ T′S leads to an infinite accumulation of input contexts, hence it requires to consider infinitely many pairs of session types. E.g., after TS selects the output label *tm* twice, the subtyping simulation game considers the pair (TS, T′′S), where T′′S is also given in Figure 2. The pairs generated for this example illustrate a common recognisable pattern where some branches grow infinitely (the *tc*-branch), while others stay stable throughout the derivation (the *done*-branch). The crux of our algorithm is to use a finite parametric characterisation of the infinitely many pairs occurring in the check of TS ≤ T′S.

The *simulation tree* for T ≤ S, written *simtree*(T, S), is the labelled tree representing the simulation game for T ≤ S, i.e., *simtree*(T, S) is a tuple (N, n0, ↪, λ) where N is its set of nodes, n0 ∈ N is its root, ↪ is its transition function, and λ is its labelling function, such that λ(n0) = (T, S). We omit the formal definition of ↪, as it is straightforward from Definition 12 following the subtyping simulation game discussed after that definition. We give an example below.

Notice that the simulation tree *simtree*(T,S) is defined only when S is controllable, since <sup>T</sup> <sup>≤</sup> <sup>S</sup> holds without needing to play the subtyping simulation game if S is not controllable. We say that a branch of *simtree*(T,S) is *successful* if it is infinite or if it finishes in a leaf labelled by (**end**, **end**). All other branches are *unsuccessful*. Under the assumption that S is controllable, we have that all branches of *simtree*(T,S) are successful if and only if <sup>T</sup> <sup>≤</sup> <sup>S</sup>. As a consequence checking whether all branches of *simtree*(T,S) are successful is generally undecidable. It is possible to identify a branch as successful if it visits finitely many pairs (or node labels), see Example 6; but in general a branch may generate infinitely many pairs, see Examples 8 and 12.

In order to support types that generate unbounded accumulation, we characterise finite subtrees — called witness subtrees, see Definition 13 — such that all the branches that traverse these finite subtrees are guaranteed to be successful.

*Notation* We give a few auxiliary definitions and notations. Hereafter A and A′ range over *extended* input contexts, i.e., input contexts that may contain distinct holes with the same index. These are needed to deal with unfoldings of input contexts, see Example 9.

**Fig. 2.** Session types T′<sub>S</sub> and T′′<sub>S</sub> (automaton representation omitted):

$$T'\_S = \mu \mathbf{t}. \& \{\, tc : \oplus\{tm : \mathbf{t},\ over : \mu \mathbf{t}'. \&\{tc : \mathbf{t}',\ done : \mathbf{end}\}\},\ done : \mu \mathbf{t}''. \oplus \{tm : \mathbf{t}'',\ over : \mathbf{end}\}\,\}$$

$$T''\_S = \& \{\, tc : \&\{tc : T'\_S,\ done : \mu \mathbf{t}''. \oplus \{tm : \mathbf{t}'',\ over : \mathbf{end}\}\},\ done : \mu \mathbf{t}''. \oplus \{tm : \mathbf{t}'',\ over : \mathbf{end}\}\,\}$$

The set of *reductions* of an input context A is the minimal set S s.t. (i) A ∈ S; (ii) if &{l<sub>i</sub> : A<sub>i</sub>}<sub>i∈I</sub> ∈ S then ∀i ∈ I. A<sub>i</sub> ∈ S; and (iii) if μ**t**.A′ ∈ S then A′{μ**t**.A′/**t**} ∈ S. Notice that, due to unfolding (item (iii)), the reductions of an input context may contain extended input contexts. Moreover, given a reduction A′ of A, we have that *holes*(A′) ⊆ *holes*(A).

*Example 9.* Consider the following extended input contexts:

$$\mathcal{A}\_1 = \mu \mathbf{t}. \& \{l\_1 : []^1, \ l\_2 : \& \{l\_3 : \mathbf{t}\}\} \qquad \mathcal{A}\_2 = \& \{l\_3 : \mu \mathbf{t}. \& \{l\_1 : []^1, \ l\_2 : \& \{l\_3 : \mathbf{t}\}\}\}$$

$$\text{unfold}(\mathcal{A}\_1) = \& \{l\_1 : []^1, \ l\_2 : \& \{l\_3 : \mu \mathbf{t}. \& \{l\_1 : []^1, \ l\_2 : \& \{l\_3 : \mathbf{t}\}\}\}\}$$

Context A<sub>2</sub> is a reduction of A<sub>1</sub>, i.e., one can reach A<sub>2</sub> from A<sub>1</sub> by unfolding A<sub>1</sub> and executing the input l<sub>2</sub>. Context unfold(A<sub>1</sub>) is also a reduction of A<sub>1</sub>. Observe that unfold(A<sub>1</sub>) contains two distinct holes indexed by 1.
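The one-step unfolding used above (and the unfold(T) appearing in Definition 13 below) is the standard unfolding of recursive terms. As a concrete illustration, the following is a minimal Haskell sketch of session-type syntax and one-step unfolding; the datatype and function names are ours and are not taken from the tool of Section 5.

```haskell
-- A minimal sketch (ours, not the tool of Section 5) of session types with
-- output selection, input branching, recursion and termination.
import qualified Data.Map as M

data SType
  = Sel (M.Map String SType)   -- output selection  (+){l_i : T_i}
  | Bra (M.Map String SType)   -- input branching   &{l_i : T_i}
  | Rec String SType           -- recursive type    mu t. T
  | Var String                 -- recursion variable t
  | End
  deriving (Eq, Show)

-- Substitution of a closed type for a recursion variable; capture cannot occur
-- because the substituted type is closed.
subst :: String -> SType -> SType -> SType
subst x s t = case t of
  Sel bs  -> Sel (M.map (subst x s) bs)
  Bra bs  -> Bra (M.map (subst x s) bs)
  Rec y b -> if y == x then Rec y b else Rec y (subst x s b)
  Var y   -> if y == x then s else Var y
  End     -> End

-- One-step unfolding: unfold(mu t.T) = T{mu t.T / t}; other types are unchanged.
unfold :: SType -> SType
unfold t@(Rec x body) = subst x t body
unfold t              = t
```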

Given an extended context A and a set of hole indices K such that K ⊆ *holes*(A), we use the following shorthands. Given a type T<sub>k</sub> for each k ∈ K, we write A⟦T<sub>k</sub>⟧<sup>k∈K</sup> for the extended context obtained by replacing each hole k ∈ K in A by T<sub>k</sub>. Also, given an extended context A′, we write A⟨A′⟩<sup>K</sup> for the extended context obtained by replacing each hole k ∈ K in A by A′. When K = {k}, we often omit K and write, e.g., A⟨A′⟩<sup>k</sup> and A⟦T<sub>k</sub>⟧<sup>k</sup>.

*Example 10.* Using the above notation and posing A = &{*tc* : []<sup>1</sup>, *done* : []<sup>2</sup>}, we can rewrite T′′<sub>S</sub> (Figure 2) as A⟨A⟦T′<sub>S</sub>⟧<sup>1</sup>⟩<sup>1</sup>⟦μ**t**. ⊕ {*tm* : **t**, *over* : **end**}⟧<sup>2</sup>.

*Example 11.* Consider the session type below

$$S = \& \{ l\_1 : \& \{ l\_1 : T\_1, \ l\_2 : T\_2, \ l\_3 : T\_3 \}, \ l\_2 : \& \{ l\_1 : T\_1, \ l\_2 : T\_2, \ l\_3 : T\_3 \}, \ l\_3 : T\_3 \}. $$

Posing A = &{l<sub>1</sub> : []<sup>1</sup>, l<sub>2</sub> : []<sup>2</sup>, l<sub>3</sub> : []<sup>3</sup>} we have *holes*(A) = {1, 2, 3}. Assuming J = {1, 2} and K = {3}, we can rewrite S as A⟨A⟦T<sub>j</sub>⟧<sup>j∈J</sup>⟩<sup>J</sup>⟦T<sub>k</sub>⟧<sup>k∈K</sup>.

*Example 12.* Figure 3 shows the partial simulation tree for T<sub>S</sub> ≤ T′<sub>S</sub>, from Figures 1 and 2 (ignore the dashed edges for now). Notice how the branch leading to the top part of the tree visits only finitely many node labels (see the dotted box), whereas the bottom part of the tree generates infinitely many labels, see the path along the !*tm* transitions in the dashed box.

**Fig. 3.** Simulation tree for T<sub>S</sub> ≤ T′<sub>S</sub> (Figures 1 and 2); the root of the tree is in bold.

*Witness subtrees* Next, we define witness trees: finite subtrees of a simulation tree whose branches we prove to be successful. The role of a witness subtree is to identify branches that satisfy a certain accumulation pattern. It detects an input context A whose holes fall into two categories: (i) growing holes (indexed by indices in J below), which lead to an infinite growth, and (ii) constant holes (indexed by indices in K below), which stay stable throughout the simulation game. The definition of witness trees relies on the notion of *ancestor* of a node n, which is a node n′ (different from n) on the path from the root n<sub>0</sub> to n. We illustrate witness trees with Figure 3 and Example 13.

**Definition 13 (Witness Tree).** *A tree* (N, n<sub>0</sub>, ↪, λ) *is a* witness tree *for* A*, such that* holes(A) = I*, with* ∅ ⊆ K ⊂ I *and* J = I \ K*, if all the following conditions are satisfied:*

*(1) every node* n ∈ N *is labelled by a pair* (T, S) *in which* S *is obtained by filling the holes of* A*, or of a reduction/repetition of* A*, as in the cases below; moreover, writing* A′ *for the (extended) context occurring in* λ(n)*:*
- *holes*(A′) ⊆ K *implies that* n *is a leaf, and*
- *if* λ(n) = (T, A[S<sub>i</sub>]<sup>i∈I</sup>) *and* n *is not a leaf, then* unfold(T) *starts with an output selection;*

*(2) every leaf* n *satisfies one of the following:*
- *(a)* n *has an ancestor* n′ *s.t.* λ(n′) = λ(n)*;*
- *(b)* λ(n) = (T, A⟨A⟦S<sub>j</sub>⟧<sup>j∈J</sup>⟩<sup>J</sup>⟦S<sub>k</sub>⟧<sup>k∈K</sup>) *and* n *has an ancestor* n′ *s.t.* λ(n′) = (T, A[S<sub>i</sub>]<sup>i∈I</sup>)*;*
- *(c)* λ(n) = (T, A[S<sub>i</sub>]<sup>i∈I</sup>) *and* n *has an ancestor* n′ *s.t.* λ(n′) = (T, A⟨A⟦S<sub>j</sub>⟧<sup>j∈J</sup>⟩<sup>J</sup>⟦S<sub>k</sub>⟧<sup>k∈K</sup>)*;*
- *(d)* λ(n) = (T, A′[S<sub>k</sub>]<sup>k∈K′</sup>) *where* K′ ⊆ K*;*

*and, for all leaves* (T, S) *of type (2c) or (2d),* T ≤ S *holds.*

Intuitively, Condition (1) says that a witness subtree consists of nodes that are labelled by pairs (T, S) where S contains a fixed context A (or a reduction/repetition thereof) whose holes are partitioned into growing holes (J) and constant holes (K). Whenever all growing holes have been removed from a pair (by reduction of the context), the pair labels a leaf of the tree. In addition, if the initial input is limited to only one instance of A, the l.h.s. type starts with an output selection so that this input cannot be consumed in the subtyping simulation game.

Condition (2) says that all leaves of the tree must satisfy certain conditions from which we can infer that their continuations in the full simulation tree lead to successful branches. Leaves satisfying Condition (2a) straightforwardly lead to successful branches, as the subtyping simulation game starting from the corresponding pair has already been checked starting from the ancestor having the same label. Leaves satisfying Condition (2b) lead to an infinite but regular "increase" of the types in J-indexed holes, following the same pattern of accumulation as their ancestor. The next two kinds of leaves must additionally satisfy the subtyping relation, either using witness trees inductively or based on the fact that they generate finitely many labels. Leaves satisfying Condition (2c) lead to a regular "decrease" of the types in J-indexed holes, following the same pattern of reduction as their ancestor. Leaves satisfying Condition (2d) use only constant K-indexed holes because, by reduction of the context A′, the growing holes containing the accumulation of A have been removed.

*Remark 1.* Definition 13 is parameterised by an input context A. We explain how such contexts can be identified while building a simulation tree in Section 5.

*Example 13.* In the tree of Figure 3 we highlight two subtrees. The subtree in the dotted box is not a witness subtree because it does not validate Condition (1) of Definition 13, i.e., there is an intermediary node with a label in which the r.h.s. type does not contain A.

The subtree in the dashed box is a witness subtree with 3 leaves, where the dashed edges represent the ancestor relation, <sup>A</sup> <sup>=</sup> &{*tc* :[]<sup>1</sup>, *done* :[]<sup>2</sup>}, <sup>J</sup> <sup>=</sup> {1} and <sup>K</sup> <sup>=</sup> {2}. We comment on the leaves clockwise, starting from (**end**, **end**), which satisfies Condition (2d). The next leaf satisfies condition (2c), while the final leaf satisfies Condition (2b).

*Algorithm* Given two session types T and S, we first check whether S is uncontrollable. If this is the case, we immediately conclude that T ≤ S. Otherwise, we proceed in four steps.

**S1** We compute a finite fragment of *simtree*(T,S), stopping whenever (i) we encounter a leaf (successful or not), (ii) we encounter a node that has an ancestor as defined in Definition 13 (Conditions (2a), (2b), and (2c)), or (iii) the length of the path from the root of *simtree*(T,S) to the current node exceeds a bound set to two times the depth of the AST of S. This bound allows the algorithm to explore paths that traverse the supertype at least twice. We have empirically confirmed that it is sufficient for all examples mentioned in Section 5.

**S2** We remove subtrees from the tree produced in **S1** corresponding to successful branches of the simulation game which contain finitely many labels. Concretely, we remove each subtree in which every leaf n either is successful or has an ancestor n′ such that n′ is in the same subtree and λ(n) = λ(n′).

**S3** We extract subtrees from the tree produced in **S2** that are potential *candidates* to be subsequently checked. The extraction of these finite candidate subtrees is done by identifying the forest of subtrees rooted in ancestor nodes which do not have ancestors themselves.

**S4** We check that each of the candidate subtrees from **S3** is a witness tree.

If an unsuccessful leaf is found in **S1**, then the considered session types are not related. If, in **S1**, the generation of the subtree reached the bound before reaching an ancestor or a leaf, then the algorithm is unable to give a decisive verdict, i.e., the result is *unknown*. Otherwise, if all checks in **S4** succeed, then the session types are in the fair asynchronous subtyping relation. In all other cases, the result is *unknown* because some candidate subtree is not a witness.

*Example 14.* We illustrate the algorithm above with the tree in Figure 3. After **S1**, we obtain the whole tree in the figure (11 nodes). After **S2**, all nodes in the dotted box are removed. After **S3**, we obtain the (unique) candidate subtree contained in the dashed box. This subtree is identified as a witness subtree in **S4**, hence we have T<sub>S</sub> ≤ T′<sub>S</sub>.

We state the main theorem that establishes the soundness of our algorithm, where ↪<sup>∗</sup> is the reflexive and transitive closure of ↪.

**Theorem 5.** *Let* T *and* S *be session types s.t. simtree*(T,S) = (N, n<sub>0</sub>, ↪, λ)*. If simtree*(T,S) *contains a witness subtree with root* n *then for every node* n′ ∈ N *s.t.* n ↪<sup>∗</sup> n′*, either* n′ *is a successful leaf, or there exists* n′′ *s.t.* n′ ↪ n′′*.*

We can conclude that if the candidate subtrees of *simtree*(T,S) identified with the strategy explained above are also witness subtrees, then we have <sup>T</sup> <sup>≤</sup> <sup>S</sup>.

#### **5 Implementation**

To evaluate our algorithm, we have produced a Haskell implementation of it, which is available on GitHub [31]. Our tool takes two session types T and S as input, then applies Steps **S1** to **S4** to check whether T ≤ S. A user-provided bound can be given as an optional argument. We have run our tool on a dozen examples handcrafted to test the limits of our algorithm (including the examples discussed in this paper), as well as on the 174 tests taken from [6]. All of these tests terminate in under a second.

For debugging and illustration purposes, the tool can optionally generate graphical representations of the simulation and witness trees, and check whether the given types are controllable. We give examples of these in [9].

Our tool internally uses automata to represent session types and uses strong bisimilarity instead of syntactic equality between session types. Using automata internally helps us identify candidate input contexts, as we can keep track of the states that correspond to the input context computed when applying Case (4) of Definition 12. In particular, we augment each local state in the automata representation of the candidate supertype with two counters: the c-counter keeps track of how many times a state has been used in an input *c*ontext; the h-counter keeps track of how many times a state has occurred within a *h*ole of an input context. We illustrate this with Figure 4, which shows the internal data structures our tool manipulates when checking T<sub>S</sub> ≤ T′<sub>S</sub> from Figures 1 and 2. The state indices of the automata in Figure 4 correspond to the ones in Figure 1 (2nd column) and Figure 2 (3rd column).

The first row of Figure 4 represents the root of the simulation tree, where both session types are in their respective initial state and no transition has been executed. We use state labels of the form n<sub>c,h</sub> where n is the original identity of the state, c is the value of the c-counter, and h is the value of the h-counter. The second row depicts the configuration after firing transition !*tm*, via Case (4) of Definition 12. While the candidate subtype remains in state 0 (due to a self-loop), the candidate supertype is unfolded with selUnfold(T′<sub>S</sub>) (Definition 10). The resulting automaton contains an additional state and two transitions. All previously existing states have their h-counter incremented, while the new state has its c-counter incremented. The third row of the figure shows the configuration after firing transition !*over*, using Case (4) of Definition 12 again. In this step, another copy of state 0 is added. Its c-counter is set to 2 since this state has been used in a context twice; and the h-counters of all other states are incremented.

Using this representation, we construct a candidate input context by building a tree whose root is a state q<sub>c,h</sub> such that c > 1. The nodes of the tree are taken from the states reachable from q<sub>c,h</sub>, stopping when a state q′<sub>c′,h′</sub> such that c′ < c is found. A leaf q′<sub>c′,h′</sub> becomes a hole of the input context. The hole is a constant (K) hole when h = c, and a growing (J) hole otherwise. Given this strategy and the configurations in Figure 4, we successfully identify the context A = &{*tc* : []<sup>1</sup>, *done* : []<sup>2</sup>} with J = {1} and K = {2}.

#### **6 Related and Future Work**

*Related work* We first compare with previous work on refinement for asynchronous communication by some of the authors of this paper. The work in [10] also considers fair compliance; however, here we consider binary (instead of multiparty) communication, and we use a unique input queue for all incoming messages instead of distinct named input channels.


**Fig. 4.** Internal representation of the simulation tree for T<sub>S</sub> ≤ T′<sub>S</sub> (fragment).

Moreover, here we provide a sound characterisation of fair refinement using coinductive subtyping, together with a sound algorithm and its implementation. In [13] the asynchronous subtyping of [7, 14, 15, 26] is used to characterise refinement for a notion of correct composition based on the impossibility of reaching a deadlock, instead of the possibility of reaching a final successful configuration as done in the present paper. The refinement from [13] does not support examples such as those in Figure 1.

Concerning previous notions of synchronous subtyping, Gay and Hole [17,18] first introduced the notion of subtyping for *synchronous* session types, which is decidable in quadratic time [22]. This subtyping only supports covariance of outputs and contravariance of inputs, but does not address anticipation of outputs. Padovani studied a notion of fair subtyping for *synchronous* multiparty session types in [29]. This work notably considers the notion of *viability* which corresponds, in the synchronous multiparty setting, to our notion of controllability. We use the term controllability instead of viability following the tradition of service contract theories like those based on Petri nets [25, 33] or process calculi [12]. In contrast to [29], asynchronous communication makes it much more involved to characterise controllability in a decidable way, as we do in this paper. Fair refinement in [29] is characterised by defining a coinductive relation on normal forms of types, obtained by removing inputs leading to uncontrollable continuations. Instead of using normal forms, we remove these inputs during the asynchronous subtyping check. A limited form of variance on output is also admitted in [29]. Covariance between the outputs of a subtype and those of a supertype is possible when the additional branches in the supertype are not needed to ensure compliance with potential partners. In [29] this check is made possible by exploiting a *difference* operation [29, Definition 3.15] on types, which synthesises a new type representing branches of one type that are absent in the other. We observe that the same approach cannot work to introduce variance on outputs in an asynchronous setting. Indeed, the interplay between output anticipation and recursion could generate differences in the branches of a subtype and a supertype that cannot be statically represented by a (finite) session type.

Padovani also studied an alternative notion of fair *synchronous* subtyping in [28]. Although the contribution of that paper refers to session types, the formal framework therein seems to deviate from the usual session type approach. In particular, it considers shared channel communication instead of binary channels: when a partner emits a message, it is possible to have a race among several potential receivers for consuming it. As a consequence of this alternative semantics, the subtyping in [28] does not admit variance on input. Another difference with respect to the session type literature is the notion of *success* among interacting sessions: a composition of sessions is successful if at least one participant reaches an internal successful state. This approach has commonalities with testing [27], where only the test composed with the system under test is expected to succeed, but differs from the typical notion of success considered for session types. In [2,3] (resp. [14]) it was proved that the Gay-Hole synchronous session subtyping (resp. orphan-message-free asynchronous subtyping) coincides with the refinement induced by a successful termination notion requiring interacting processes to be *both* in the **end** state (with empty buffers, in the asynchronous case).

Several variants of asynchronous session subtyping have been proposed in [14, 15, 26] and further studied in our earlier work [6, 7, 13]. All these variants have been shown to be undecidable [7, 8, 23]. Moreover, all these subtyping relations are (implicitly) based on an unfair notion of compliance. Concretely, the definition of asynchronous subtyping introduced in this paper differs from the one in [14,15] since no additional constraint guaranteeing absence of orphan messages is considered. Such a constraint requires the subtype not to have output loops whenever an output anticipation is performed, thus guaranteeing that at least one input is performed in all possible paths. In this paper, absence of orphan messages is guaranteed by enforcing types to (fairly) reach a successful termination. Moreover, our novel subtyping differs from those in [14, 15, 26] since we use recursive input contexts (and not just finite ones) for the first time; this is necessary to obtain T′<sub>G</sub> ≤ T<sub>G</sub> and T<sub>S</sub> ≤ T′<sub>S</sub> (see Figures 1 and 2). Notice that not imposing the above mentioned orphan-message-free constraint of [14, 15] is consistent with recursive input contexts, which allow for input loops in the supertype whenever an output anticipation is performed. In [6], we proposed a sound algorithm for the asynchronous subtyping in [14]. The sound algorithm that we present in this paper substantially differs from that of [6]. Here we use witness trees that take into account both the increase and the decrease of accumulated input. In [6], instead, only regular growing accumulation is considered.

*Future work* In future work, we will investigate how to support output variance in fair asynchronous subtyping. We also plan to study fairness in the context of asynchronous multiparty session types, as fair compliance and refinement extend naturally to several partners. Finally, we will investigate a more refined termination condition for our algorithm using ideas from [6, Definition 11].

### **References**



#### Running Time Analysis of Broadcast Consensus Protocols

Philipp Czerner and Stefan Jaax

Fakultät für Informatik, Technische Universität München, Garching bei München, Germany {czerner,jaax}@in.tum.de

Abstract. Broadcast consensus protocols (BCPs) are a model of computation in which anonymous, identical, finite-state agents compute by sending/receiving global broadcasts. BCPs are known to compute all number predicates in NL = NSPACE(log n), where n is the number of agents. They can be considered an extension of the well-established model of population protocols. This paper investigates execution time characteristics of BCPs. We show that every predicate computable by population protocols is computable by a BCP with expected O(n log n) interactions, which is asymptotically optimal. We further show that every log-space, randomized Turing machine can be simulated by a BCP with O(n log n · T) interactions in expectation, where T is the expected runtime of the Turing machine. This allows us to characterise polynomial-time BCPs as computing exactly the number predicates in ZPL, i.e. predicates decidable by a log-space, randomised Turing machine with zero-error in expected polynomial time, where the input is encoded in unary.

Keywords: broadcast protocols · complexity theory · distributed computing

### 1 Introduction

In recent years, models of distributed computation following the *computation-by-consensus* paradigm have attracted considerable interest in research (see for example [9,25,26,8,13]). In such models, network agents compute number predicates, i.e. Boolean-valued functions of the type N<sup>k</sup> → {0, 1}, by reaching a stable consensus whose value determines the outcome of the computation. Perhaps the most prominent model following this paradigm is that of *population protocols* [5,6], a model in which anonymous, identical, finite-state agents interact randomly in pairwise rendezvous to agree on a common Boolean output.

Due to anonymity and locality of interactions, it is an inherent property of population protocols that agents are generally unable to detect with absolute certainty when the computation has stabilized.

⋆ This work was supported by an ERC Advanced Grant (787367: PaVeS) and by the Research Training Network of the Deutsche Forschungsgemeinschaft (DFG) (378803395: ConVeY).

⋆⋆ The full version of this paper can be found at https://arxiv.org/abs/2101.03780.


This makes sequential composition of protocols difficult, and further complicates the implementation of control structures such as loops or branching statements. To overcome this drawback, two kinds of approaches have been suggested in the literature: 1.) Let agents guess when the computation has stabilized, leading to composable, but merely *approximately correct* protocols [7,24], or 2.) extend population protocols by global communication primitives that enable agents to query global properties of the agent population [13,8,26].

Approaches of the first kind are for the most part based on simulations of global broadcasts by means of *epidemics*. In epidemics-based approaches the spread of the broadcast signal is simulated by random pairwise rendezvous, akin to the spread of a viral epidemic in a population. When the broadcasting agent meets a certain fraction of "infected" agents, it may decide with reasonable certainty that the broadcast has propagated throughout the entire population, which then leads to the initiation of the next computation phase. Of course, the decision to start the next phase may be premature, in which case the rest of the execution may be faulty. However, epidemics can also be used to implement phase clocks that help keep the failure probability low (see e.g. [7]).

In [13], Blondin, Esparza, and one of the authors of this paper introduced *broadcast consensus protocols* (BCPs), an extension of population protocols by reliable, global, and atomic broadcasts. BCPs find their precursor in the broadcast protocol model introduced by Emerson and Namjoshi in [17] to describe bus-based hardware protocols. This model has been investigated intensely in the literature, see e.g. [18,19,15,28]. Broadcasts also arise naturally in biological systems. For example, Uhlendorf *et al.* analyse applications of broadcasts in the form of an external, global light source for controlling a population of yeasts [12].

The authors of [13] show that BCPs compute precisely the predicates in NL = NSPACE(log n), where n is the number of agents. For comparison, it is known that population protocols compute precisely the *Presburger predicates*, which are the predicates definable in the first-order theory of the integers with addition and the usual order; a class much less expressive than the former.

An epidemics-based approach was used in [7] to show that population protocols can simulate with high probability a step of a virtual register machine with expected O(n log<sup>5</sup> n) interactions, where n is the number of agents. This result stimulated further research into time bounds for classical problems such as leader election (see e.g. [21,1,16,29,11]) and majority (see e.g. [4,2]). In their seminal paper [5], Angluin *et al.* already showed that population protocols can stably compute Presburger predicates with O(n<sup>2</sup> log n) interactions in expectation. Belleville *et al.* further showed that leaderless protocols require a quadratic number of interactions in expectation to stabilize to the correct output for a wide class of predicates [10]. The aforementioned bounds apply to *stabilisation time*: the time it takes to go from an initial configuration to a stable consensus that cannot be destroyed by future interactions. In [24], Kosowski and Uznanski considered the weaker notion of *convergence time*: the time it takes on average to ultimately transition to the correct consensus (although this consensus could in principle be destroyed by future interactions), and they show that sublinear convergence time is achievable.

By contrast, to the best of our knowledge, time characteristics of BCPs have not been discussed in the literature. The NL characterisation presented in [13] does not establish any time bounds. In fact, [13] only considers a non-probabilistic variant of BCPs with a global fairness assumption instead of probabilistic choices.

Contributions of the paper. This paper initiates the runtime analysis of BCPs in terms of expected number of interactions to reach a stable consensus. To simplify the definition of probabilistic execution semantics, we introduce a restricted, deterministic variant of BCPs without rendezvous transitions. In Section 2, we define probabilistic execution semantics for the restricted version of BCPs, and we provide an introductory example for a fast protocol computing majority in Section 3.

In Section 4, we show that these restrictions of our BCP model are inconsequential in terms of expected number of interactions: both rendezvous and nondeterministic choices can be simulated with a constant runtime overhead.

In Section 5, we show that every Presburger predicate can be computed by BCPs with O(n log n) interactions and with constant space, where n denotes the number of agents in the population. This result is asymptotically optimal.

In more generality, in Section 6, we use BCPs to simulate Turing machines (TMs). In particular, we show that any randomised, logarithmically space-bound, polynomial-time TM can be simulated by a BCP with an overhead of O(n log n) interactions per step. Conversely, any polynomial-time BCP can be simulated by such a TM. This result can be considered an improvement of the NL bound from [13], now in a probabilistic setting. We also give a corresponding upper bound, which yields the following succinct characterisation: polynomial-time BCPs compute exactly the number predicates in ZPL, which are the languages decidable by randomised log-space polynomial-time TMs with zero-error (the log-space analogue to ZPP).

Bounding the time requires a careful analysis of each step in the simulation of the Turing machine. Thus, our proof diverges in significant ways from the proof establishing the NL lower bound in [13]. Most notably, we now make use of epidemics in order to implement clocks that help reduce failure rates.

### 2 Preliminaries

Complexity classes. As is usual, we define NL as the class of languages decidable by a nondeterministic log-space TM. Additionally, by ZPL we denote the set of languages decided by a randomised log-space TM A, s.t. A terminates only with the correct result (zero-error) and terminates within O(poly n) steps in expectation, as defined by Nisan in [27].

Multisets. A *multiset* over a finite set E is a mapping M : E → N. The set of all multisets over E is denoted N<sup>E</sup>. For every e ∈ E, M(e) denotes the number of occurrences of e in M. We sometimes denote multisets using a set-like notation, e.g. ⦃f, g, g⦄ is the multiset M such that M(f) = 1, M(g) = 2 and M(e) = 0 for every e ∈ E \ {f, g}. Addition, comparison and scalar multiplication are extended to multisets componentwise, i.e. (M + M′)(e) def= M(e) + M′(e), (λM)(e) def= λM(e) and M ≤ M′ def⇔ M(e) ≤ M′(e) for every M, M′ ∈ N<sup>E</sup>, e ∈ E, and λ ∈ N. For M′ ≤ M we also define componentwise subtraction, i.e. (M − M′)(e) def= M(e) − M′(e) for every e ∈ E. For every e ∈ E, we write ⦃e⦄ for the multiset containing exactly one occurrence of e. We lift functions f : E → E′ to multisets by defining f(M)(e′) def= Σ<sub>f(e)=e′</sub> M(e) for e′ ∈ E′. Finally, we define the *support* and *size* of M ∈ N<sup>E</sup> respectively as ⟦M⟧ def= {e ∈ E : M(e) > 0} and |M| def= Σ<sub>e∈E</sub> M(e).
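For readers who prefer code, the multiset operations above can be sketched as follows, representing a multiset over E as a map from elements to counts; the function names are ours and purely illustrative.

```haskell
-- A minimal sketch of the multiset operations used below.
import qualified Data.Map as M

type MSet e = M.Map e Int

msum :: Ord e => MSet e -> MSet e -> MSet e          -- M + M'
msum = M.unionWith (+)

mleq :: Ord e => MSet e -> MSet e -> Bool            -- M <= M' (componentwise)
mleq m m' = and [ k <= M.findWithDefault 0 e m' | (e, k) <- M.toList m ]

support :: MSet e -> [e]                             -- support of M, as a list
support m = [ e | (e, k) <- M.toList m, k > 0 ]

size :: MSet e -> Int                                 -- |M|
size = sum . M.elems
```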

Broadcast Consensus Protocols. A *broadcast consensus protocol* [13] (BCP) is a tuple P = (Q, Σ, δ, I, O) where
– Q is a finite set of *states*,
– Σ is a finite *input alphabet*,
– δ is the *broadcast transition function* described below,
– I : Σ → Q is the *input mapping*, and
– O ⊆ Q is the set of *accepting states*.


The function δ maps every state q ∈ Q to a pair (r, f) consisting of the *successor state* r ∈ Q and the *response function* f : Q −→ Q.

Configurations. A *configuration* is a multiset C ∈ N<sup>Q</sup>. Intuitively, a configuration C describes a collection of identical finite-state *agents* with Q as set of states, containing C(q) agents in state q for every q ∈ Q. We say that C ∈ N<sup>Q</sup> is a 1*-consensus* if ⟦C⟧ ⊆ O, and a 0*-consensus* if ⟦C⟧ ⊆ Q \ O.

Step relation. A broadcast δ(q)=(r, f) is executed in three steps: (1) an agent at state q broadcasts a signal and leaves q; (2) all other agents receive the signal and move to the states indicated by the function f, i.e. an agent in state s moves to f(s); and (3) the broadcasting agent enters state r.

Formally, for two configurations C, C′ we write C −→ C′ whenever there exists a state q ∈ Q s.t. C(q) ≥ 1, δ(q) = (r, f), and C′ = f(C − ⦃q⦄) + ⦃r⦄ is the configuration computed from C by the above three steps. By <sup>∗</sup>−→ we denote the reflexive-transitive closure of −→.

For example, consider a configuration C def= ⦃a, a, b⦄ and a broadcast transition a → b, {a → c, b → d}. To execute this transition, we move an agent from state a to state b and apply the transition function to all other agents, so we end up in C′ def= ⦃b⦄ + ⦃c, d⦄.
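The three-step execution of a broadcast can be sketched in a few lines of Haskell, with configurations represented as maps from states to agent counts; the `Config` type and the function names are ours, not part of the paper.

```haskell
-- A sketch of one broadcast step: the broadcaster leaves q for r, every other
-- agent s moves to f s.  Configurations are multisets of states.
import qualified Data.Map as M

type Config q = M.Map q Int

step :: Ord q => (q -> (q, q -> q)) -> q -> Config q -> Config q
step delta q c
  | M.findWithDefault 0 q c < 1 = c      -- no agent in state q: nothing to do
  | otherwise =
      let (r, f) = delta q
          rest   = M.adjust (subtract 1) q c                           -- remove the broadcaster
          moved  = M.fromListWith (+) [ (f s, k) | (s, k) <- M.toList rest, k > 0 ]
      in M.insertWith (+) r 1 moved                                    -- add the broadcaster in r
```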

Broadcast transitions. We write broadcast transitions as q ↦ r, S with S a set of expressions q′ → r′. This refers to δ(q) = (r, f), with f(q′) = r′ for (q′ → r′) ∈ S. We usually omit identity mappings q′ → q′ when specifying S.

For graphic representations of broadcast protocols we use a different notation, which separates sending and receiving broadcasts. There we identify a transition δ(q) = (r, f) with a name α and specify it by writing q −!α→ r and q′ −?α→ r′ for f(q′) = r′. Intuitively, q′ −?α→ r′ can be understood as an agent transitioning from q′ to r′ upon receiving the signal α, and q −!α→ r means that an agent in state q may transmit the signal α and simultaneously transition to state r.

As defined, δ is a total function, so each state is associated with a unique broadcast. If we do not specify a transition δ(q)=(r, f) explicitly, we assume that it simply maps each state to itself, i.e. q → q, {r → r : r ∈ Q}. We refer to those transitions as *silent*.

Executions. An *execution* is an infinite sequence π = C<sub>0</sub>C<sub>1</sub>C<sub>2</sub>... of configurations with C<sub>i</sub> −→ C<sub>i+1</sub> for every i. It has some fixed number of agents n def= |C<sub>0</sub>| = |C<sub>1</sub>| = ... . Given a BCP and an initial configuration C<sub>0</sub> ∈ N<sup>Q</sup>, we generate a random execution with the following Markov chain: to perform a step at configuration C<sub>i</sub>, a state q ∈ Q is picked at random with probability distribution p(q) = C<sub>i</sub>(q)/|C<sub>i</sub>|, and the (uniquely defined) transition δ(q) is executed, giving the successor configuration C<sub>i+1</sub>. We refer to the random variable corresponding to the trace of this Markov chain as *random execution*.
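The Markov chain just described can be sketched as follows, assuming a step function such as the one in the previous sketch; the sampling uses randomRIO from the `random` package, and all names are ours.

```haskell
-- A sketch of the random execution: pick a state q with probability C(q)/|C|
-- and apply the broadcast for q (e.g. via the step function sketched above).
import qualified Data.Map as M
import System.Random (randomRIO)

type Config q = M.Map q Int

randomExecution :: (q -> Config q -> Config q) -> Int -> Config q -> IO [Config q]
randomExecution _       0 c = return [c]
randomExecution stepFor k c = do
  q  <- pickAgentState c
  cs <- randomExecution stepFor (k - 1) (stepFor q c)
  return (c : cs)

-- pick a state with probability proportional to the number of agents in it
pickAgentState :: Config q -> IO q
pickAgentState c = do
  i <- randomRIO (1, sum (M.elems c))
  let go acc ((s, k) : rest) | i <= acc + k = s
                             | otherwise    = go (acc + k) rest
      go _ [] = error "empty configuration"
  return (go 0 (M.toList c))
```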

Stable Computation. Let π denote an execution and inf(π) the configurations occurring infinitely often in π. If inf(π) contains only b-consensuses, we say that <sup>π</sup> *stabilises* to <sup>b</sup>. For a predicate <sup>ϕ</sup> : <sup>N</sup><sup>Σ</sup> → {0, <sup>1</sup>} we say that <sup>P</sup> *(stably) computes* <sup>ϕ</sup>, if for all inputs <sup>X</sup> <sup>∈</sup> <sup>N</sup><sup>Σ</sup>, the random execution of <sup>P</sup> with initial configuration C<sup>0</sup> = I(X) stabilises to ϕ(X) with probability 1.

Finally, for an execution π = C<sub>0</sub>C<sub>1</sub>C<sub>2</sub>... we let T<sub>π</sub> denote the smallest i s.t. all configurations in C<sub>i</sub>C<sub>i+1</sub>... are ϕ(X)-consensuses, or ∞ if no such i exists. We say that a BCP P *computes* ϕ *within* f(n) *interactions* if, for all initial configurations C<sub>0</sub> with n agents, the random execution π starting at C<sub>0</sub> has E(T<sub>π</sub>) ≤ f(n) < ∞, i.e. P stabilises within f(n) steps in expectation. If f ∈ O(poly(n)), then we call P a *polynomial-time* BCP.

Global States. Often, it is convenient to have a shared global state between all agents. If, for a BCP P = (Q, Σ, δ, I, O), we have Q = S × G, I(Σ) ⊆ Q × {j} for some j ∈ G, and f((s, j)) ∈ S × {j′} for each δ((q, j)) = ((r, j′), f), then we say that P has *global states* G. A configuration C has *global state* j, if ⟦C⟧ ⊆ S × {j} for j ∈ G. Note that, starting from a configuration with global state j, P can only reach configurations with a global state. Hence for P we will generally only consider configurations with a global state. To make our notation more concise, when specifying a transition δ(q) = (r, f) for P, we will write f as a mapping from S to S, as q and r already determine the mapping of global states.

Population Protocols. A population protocol [5] replaces broadcasts by local rendezvous. It can be specified as a tuple (Q, Σ, δ, I, O) where Q, Σ, I, O are defined as in BCPs, and δ : Q<sup>2</sup> → Q<sup>2</sup> defines *rendezvous transitions*. A step of the protocol at C is made by picking two agents uniformly at random, and applying δ to their states: first q<sub>1</sub> ∈ Q is picked with probability C(q<sub>1</sub>)/|C|, then q<sub>2</sub> ∈ Q is picked with probability C′(q<sub>2</sub>)/|C′|, where C′ def= C − ⦃q<sub>1</sub>⦄. The successor configuration then is C − ⦃q<sub>1</sub>, q<sub>2</sub>⦄ + ⦃r<sub>1</sub>, r<sub>2</sub>⦄ where δ(q<sub>1</sub>, q<sub>2</sub>) = (r<sub>1</sub>, r<sub>2</sub>).
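For comparison with the broadcast step sketched earlier, the rendezvous step of a population protocol under this sampling scheme could look as follows; again the names are ours and purely illustrative.

```haskell
-- A sketch of one rendezvous step: the first agent is drawn from C, the second
-- from C minus that agent, then delta is applied to the pair of states.
import qualified Data.Map as M
import System.Random (randomRIO)

type Config q = M.Map q Int

rendezvousStep :: Ord q => ((q, q) -> (q, q)) -> Config q -> IO (Config q)
rendezvousStep delta c = do
  q1 <- pick c
  let c' = M.adjust (subtract 1) q1 c
  q2 <- pick c'
  let (r1, r2) = delta (q1, q2)
      removed  = M.adjust (subtract 1) q2 c'
  return (M.insertWith (+) r1 1 (M.insertWith (+) r2 1 removed))

-- pick a state with probability proportional to its multiplicity
pick :: Config q -> IO q
pick c = do
  i <- randomRIO (1, sum (M.elems c))
  let go acc ((s, k) : rest) | i <= acc + k = s
                             | otherwise    = go (acc + k) rest
      go _ [] = error "empty configuration"
  return (go 0 (M.toList c))
```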

Broadcast Protocols. Later on we will construct BCPs out of smaller building blocks which we call *broadcast protocols (BPs)*. A BP is a pair (Q, δ), where Q and δ are defined as for BCPs. We extend the applicable definitions from above to BPs, in particular the notions of configurations, executions, and global states.

#### 3 Example: Majority

Fig. 1. A fast broadcast consensus protocol computing the majority predicate.

As an introductory example, we construct a broadcast consensus protocol for the *majority predicate* ϕ(x, y) = x > y. Figure 1 depicts the protocol graphically. We have the set of states {x, y, ◇} × {0, 1}, with global states {0, 1}, where the states O def= {(x, 1), (y, 1), (◇, 1)} are accepting, and I(x) = (x, 0) and I(y) = (y, 0). The transitions are

$$(x,0)\mapsto(\diamondsuit,1),\emptyset\tag{\alpha}$$

$$(y,1) \mapsto (\diamondsuit,0),\ \emptyset \tag{\beta}$$

Note that we use the more compact notation for transitions in the presence of global states; written in long form, (α) would be

$$(x,0)\mapsto(\diamondsuit,1),\ \{(x,0)\mapsto(x,1),\ (y,0)\mapsto(y,1),\ (\diamondsuit,0)\mapsto(\diamondsuit,1)\}\tag{\alpha}$$

To make the presentation of the following sample execution more readable, we shorten the state (i, j) to i<sub>j</sub>. For input x = 3 and y = 2, an execution could look like this:

$$\begin{aligned} \left\{x\_0, x\_0, x\_0, y\_0, y\_0\right\} &\xrightarrow{\alpha} \left\{\diamond\_1, x\_1, x\_1, y\_1, y\_1\right\} \xrightarrow{\beta} \left\{\diamond\_0, x\_0, x\_0, \diamond\_0, y\_0\right\} \\ \xrightarrow{\alpha} \left\{\diamond\_1, \diamond\_1, x\_1, \diamond\_1, y\_1\right\} &\xrightarrow{\beta} \left\{\diamond\_0, \diamond\_0, x\_0, \diamond\_0, \diamond\_0\right\} \xrightarrow{\alpha} \left\{\diamond\_1, \diamond\_1, \diamond\_1, \diamond\_1, \diamond\_1\right\} \end{aligned}$$

Intuitively, there is a preliminary global consensus, which is stored in the global state. Initially, it is rejecting, as x>y is false in the case x = y = 0. However, any x agent is enough to tip the balance, moving to an accepting global state. Now any y agent could speak up, flipping the consensus again.

The two factions initially belonging to x and y, respectively, alternate in this manner by sending signals α and β. Strict alternation is ensured as an agent will not broadcast to confirm the global consensus, only to change it.

After emitting the signal, the agent from the corresponding faction goes into state ◇, where it can no longer influence the computation. In the end, the majority faction remains and determines the final consensus.
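Assuming the conventions of Section 2, the protocol can be encoded as the following transition function, which could be plugged into the step/execution sketches given there; this is our reading of Figure 1 (with the diamond state written Diamond), not code from the paper.

```haskell
-- Our encoding of the majority protocol: local states x, y and the retired
-- diamond state, paired with the global consensus bit.
data Local = X | Y | Diamond deriving (Eq, Ord, Show)
type State = (Local, Bool)   -- (local state, global bit); accepting iff the bit is True

-- inputs: I(x) = (X, False), I(y) = (Y, False)
delta :: State -> (State, State -> State)
delta (X, False) = ((Diamond, True),  \(l, _) -> (l, True))   -- transition (alpha)
delta (Y, True)  = ((Diamond, False), \(l, _) -> (l, False))  -- transition (beta)
delta s          = (s, id)                                    -- all other transitions are silent
```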

Considering these alternations with shrinking factions, the expected number of steps of the protocol until stabilization can be bounded by 2·Σ<sub>k=1</sub><sup>n</sup> n/k = O(n log n). To see that this holds, we consider the factions separately: let n<sub>0</sub> denote the number of agents the first faction starts with (i.e. agents initially in state (x, 0)), and n<sub>1</sub> the number at the end. When we are waiting for the first transition of this faction all n<sub>0</sub> agents are enabled, so we wait n/n<sub>0</sub> steps in expectation until one of them executes a broadcast. For the next one, we wait n/(n<sub>0</sub> − 1) steps. In total, this yields Σ<sub>k=n₁+1</sub><sup>n₀</sup> n/k ≤ Σ<sub>k=1</sub><sup>n</sup> n/k steps for the first faction, and via the same analysis for the second as well.
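Making the harmonic-sum step explicit, with H<sub>n</sub> the n-th harmonic number:

$$2\sum\_{k=1}^{n} \frac{n}{k} \;=\; 2\, n H\_n \;\le\; 2n(\ln n + 1) \;=\; O(n \log n).$$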

In contrast to the O(n log n) interactions this protocol takes, constant-state population protocols require n<sup>2</sup> interactions in expectation for the computation of majority [4]. However, these numbers are not directly comparable: broadcasts may not be parallelizable, while it is uncontroversial to assume that n rendezvous occur in parallel time 1.

### 4 Comparison with other Models

To facilitate the definition of an execution model, we only consider deterministic BCPs, in the sense that for each state there is a unique transition to execute. Blondin, Esparza and Jaax [14] analysed a more general model, i.e. they allow multiple transitions for a single state, picking one of them uniformly at random when an agent in that state sends a broadcast. Additionally, as they consider BCPs as an extension of population protocols, they include rendezvous transitions. We now show that we can simulate both extensions within a constant-factor overhead.

#### 4.1 Non-Deterministic Broadcast Protocols

The following construction allows for two broadcast transitions to be executed uniformly at random from a single state. This can easily be extended to any constant number of transitions using the usual construction of a binary tree with rejection sampling.

Now assume that we are given a BCP (Q, Σ, δ0,I,F) with another set of broadcast transitions δ<sup>1</sup> and we want each agent to pick one transition uniformly at random from δ<sup>0</sup> or δ<sup>1</sup> whenever it executes a broadcast.

We implement this using a synthetic coin, i.e. we are utilising randomness provided by the scheduler to enable individual agents to make random choices. This idea has also been used for population protocols [1,3]. Compared to these implementations, broadcasts allow for a simpler approach.

The idea is that we partition the agents into types, so that half of the agents have type 0 and the other half have type 1. Additionally, there is a global coin shared across all agents. To flip the coin, a random agent announces its type (the coin is set to heads if the agent is type 0, tails if it is type 1) and a second random agent executes a broadcast transition from either δ<sup>0</sup> or δ1, depending on the state of the global coin that has just been set. These two steps repeat, the former flipping the coin fairly and the latter then executing the actual transitions. Figure 2 sketches this procedure.

Fig. 2. Transition diagram for implementing multiple broadcasts per state, for q ∈ Q, with (q, i, j) written as q<sup>i</sup> <sup>j</sup> . Dashed nodes represent multiple states, with j ∈ T. Transitions resulting from executing the broadcasts in δ0, δ<sup>1</sup> are not shown.

Intuitively, we start with no agents having either type 0 or 1. When such a typeless agent is picked by the scheduler to announce its type (to flip the global coin) it instead broadcasts that it is searching for a partner. Once this has happened twice, these two agents are matched, one is assigned type 0 and the other type 1. Thus we ensure that there is the exact same number of type 0 and type 1 agents at all times, meaning that we get a perfectly fair coin. Additionally we make progress regardless of whether an agent with or without a type is chosen.

To describe the construction formally, we introduce a set of types T def= {?, +, −, 0, 1}, and choose the set of states Q′ def= Q × T × {∗, 0, 1}, with the global states {∗, 0, 1} used to represent the state of the synthetic coin. We use (q, ?) as initial state instead of q ∈ I, and start with global state ∗. To pick types, we need transitions

$$(q,?,\*) \mapsto (q,+,\*), \{(r,?) \mapsto (r,-) : r \in Q\} \qquad \text{for } q \in Q \tag{seek}$$

$$\begin{array}{c} (q, -, \*) \mapsto (q, 1, \*), \{ (r, -) \mapsto (r, ?) : r \in Q \} \\ \qquad \cup \{ (r, +) \mapsto (r, 0) : r \in Q \} \end{array} \quad \text{for } q \in Q \tag{\text{find}}$$

So an agent of type ? announces that it seeks a partner, moving itself to type + and the others to type −. Then any type − agent may broadcast that a match has been found, moving itself to type 1 and the type + agent to type 0. The other type − agents revert to type ?. This ensures that the number of type 0 and 1 agents is always equal. Note that there may be an odd number of agents, in which case one agent of type + remains.

The following transitions effectively flip the global coin, by having an agent of type 0 or 1 announce that we now execute a broadcast transition from respectively δ<sup>0</sup> or δ1. Here, we have q ∈ Q, ◦∈{0, 1}.

$$(q, \circ, \*) \mapsto (q, \circ, \circ), \; \emptyset \tag{flip \; \circ})$$

Then we actually execute the transition δ◦(q)=(r, f), for each (q, i) ∈ Q × T.

$$(q, i, \circ) \mapsto (r, i, \*), \{(s, j) \mapsto (f(s), j) : (s, j) \in Q \times T\} \tag{\text{exec } \circ}$$

As the number of type 0 and 1 agents is equal, we select transitions from δ<sup>0</sup> and δ<sup>1</sup> uniformly at random. It remains to show that the overhead of this scheme is bounded.

Executing transition (exec 0) or (exec 1) is the goal. Transitions (flip 0) and (flip 1) ensure that the former are executed in the very next step, so they cause at most a constant-factor slowdown. Transitions (seek) and (find) can be executed at most n times, as they decrease the number of agents of type ?. All that remains is the implicit silent transition of states (q, +, j), which occurs with probability at most 1/n in each step.

Hence, to execute m ≥ n steps of the simulated protocol our construction takes at most (2m + 2n) · n/(n − 1) ≤ 8m steps in expectation.
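The final estimate can be checked directly: assuming m ≥ n and n ≥ 2,

$$(2m + 2n)\cdot\frac{n}{n-1} \;\le\; 4m \cdot 2 \;=\; 8m.$$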

#### 4.2 Population Protocols

Another extension to BCPs is the addition of rendez-vous transitions. Here we are given a map <sup>R</sup> : <sup>Q</sup><sup>2</sup> <sup>→</sup> <sup>Q</sup><sup>2</sup>. At each step, we flip a coin and either execute a broadcast transition as usual, or pick two distinct agents uniformly at random, in state q and r, respectively. These interact and move to the two states R(q, r).

Again, we can simulate this extension with only a constant-factor increase in the expected number of steps. Given a BCP (Q, Σ, B, I, F), the idea is to add states {q̃ : q ∈ Q} ∪ {r<sub>q</sub> : r, q ∈ Q} and insert "activating" transitions q ↦ q̃, {r → r<sub>q</sub> : r ∈ Q} for q ∈ Q and "deactivating" transitions r<sub>q</sub> ↦ s, {q̃ → t} ∪ {u<sub>q</sub> → u : u ∈ Q} for each R(q, r) = (s, t). So a state q first signals that it wants to start a rendez-vous transition. Then, any other state r answers, both executing the transition and signalling to all other states that it has occurred.

Each state in Q has exactly 2 broadcast transitions, so (using the scheme described above) the probability of executing any "activating" transition is exactly 1/2, the same as doing one of the original broadcast transitions in B. After doing an activating transition we may do nothing for a few steps by executing the broadcast transition on q̃, but eventually we execute a "deactivating" transition and go back. The probability of executing a broadcast on q̃ is 1/n, so simulating a single rendez-vous transition takes 1 + n/(n − 1) ≤ 3 steps in expectation.

#### 5 Protocols for Presburger Arithmetic

While Blondin, Esparza and Jaax [14] show that BCPs are more expressive than population protocols, they leave open the question of whether BCPs provide a runtime speed-up for the class of Presburger predicates computable by population protocols. We already saw that majority can be computed within O(n log n) interactions in BCPs. This also holds in general for Presburger predicates:

Theorem 1. *Every Presburger predicate is computable by a BCP within at most* O(n log n) *interactions.*

We remark that the O(n log n) bound is asymptotically optimal: e.g. the stable consensus for the parity predicate (x ≡ 1 mod 2) must alternate with the configuration size, which clearly requires every agent to perform at least one broadcast in the computation, and thus yields a lower bound of Σ<sub>k=1</sub><sup>n</sup> n/k = Ω(n log n) steps, as in the coupon collector's problem [20].

It is known [22] that every Presburger predicate can be expressed as a Boolean combination of linear inequalities and linear congruence equations over the integers, i.e. as a Boolean combination of predicates of the form Σ<sub>i</sub> α<sub>i</sub>x<sub>i</sub> < c and Σ<sub>i</sub> α<sub>i</sub>x<sub>i</sub> ≡ c (mod m), where the α<sub>i</sub>, c and m are integer constants. In Section 5.1 we construct BCPs that compute arbitrary linear inequalities, before we sketch the construction for congruences and Boolean combinations in Section 5.2.

#### 5.1 Linear Inequalities

Proposition 1. *Let* α<sub>1</sub>, ..., α<sub>k</sub>, c ∈ Z *and let* ϕ(x<sub>1</sub>, ..., x<sub>k</sub>) def⇔ Σ<sub>i=1</sub><sup>k</sup> α<sub>i</sub>x<sub>i</sub> < c *denote a linear inequality. There exists a broadcast consensus protocol that computes* ϕ *within* O(n log n) *interactions in expectation.*

*Proof.* We assume wlog that α<sub>i</sub> ≠ 0 for i = 1, ..., k and that α<sub>1</sub>, ..., α<sub>k</sub> are pairwise distinct. Let A def= max{|α<sub>1</sub>|, |α<sub>2</sub>|, ..., |α<sub>k</sub>|, |c|}. We define a BCP P = (Q × G, Σ, δ, I, O) with global states G, where

$$\begin{aligned} Q & \stackrel{\text{def}}{=} \{0, \alpha\_1, \dots, \alpha\_k\} & \Sigma & \stackrel{\text{def}}{=} \{x\_1, \dots, x\_k\} \\ G & \stackrel{\text{def}}{=} [-2A, 2A] & \qquad \qquad \qquad \qquad O \stackrel{\text{def}}{=} \{(q, v) : v < c\} \end{aligned}$$

As inputs we get I(x<sub>i</sub>) def= (α<sub>i</sub>, 0) for each i = 1, ..., k. The transitions δ are constructed as follows. For every v ∈ [−2A, 2A] and every α<sub>i</sub> satisfying v + α<sub>i</sub> ∈ [−2A, 2A], we add the following transition to δ:

$$(\alpha\_i, v) \mapsto (0, v + \alpha\_i), \emptyset \tag{\alpha\_i}$$

Intuitively, in the first component of its state an agent stores its contribution to - <sup>i</sup> αixi, the left-hand side of the inequality. The global state is used to store a counter value, initially set to 0. Each agent adds its contribution to the counter, as long as it does not overflow. The counter goes from −2A to 2A, which allows it to store the threshold plus any single contribution. The final counter value then determines the outcome of the computation.
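To illustrate the construction, the following Haskell sketch instantiates the transition function for the hypothetical inequality 2x<sub>1</sub> − 3x<sub>2</sub> < 1 (so A = 3); the names and the concrete instance are ours and serve only as an example of the scheme, not as part of the proof.

```haskell
-- A sketch of Proposition 1 for the example inequality 2*x1 - 3*x2 < 1.
-- States are (contribution, counter); the counter plays the role of the global
-- state and ranges over [-2A, 2A]; accepting states are those with counter < c.
type State = (Int, Int)

bigA, threshold :: Int
bigA = 3        -- A = max(|2|, |-3|, |1|)
threshold = 1   -- c

-- input mapping: I(x1) = (2, 0), I(x2) = (-3, 0)
delta :: State -> (State, State -> State)
delta (a, v)
  | a /= 0 && abs (v + a) <= 2 * bigA =
      ((0, v + a), \(b, _) -> (b, v + a))   -- transition (alpha_i): add own contribution
  | otherwise = ((a, v), id)                -- silent

accepting :: State -> Bool
accepting (_, v) = v < threshold
```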

Correctness. Let ctr(C) denote the global state (and thus current counter value) of configuration C. Further, let

$$\mathfrak{sum}(C) \stackrel{\text{def}}{=} \sum\_{(\alpha, v) \in Q} C(\alpha, v) \cdot \alpha + \mathfrak{ctr}(C)$$

denote the sum of all agents' contributions and the current value of the counter. Every initial configuration C<sub>0</sub> has ctr(C<sub>0</sub>) = 0 and thus sum(C<sub>0</sub>) = Σ<sub>i</sub> α<sub>i</sub>x<sub>i</sub>. Each transition (α<sub>i</sub>) increases the counter by α<sub>i</sub> but sets the agent's contribution to 0 (from α<sub>i</sub>), so sum(C) is constant throughout the execution.

Recall that our output mapping depends only on the value of the counter, so our agents always form a consensus (though not necessarily a stable one). If this consensus and ϕ(C0) disagree, then, we claim, a non-silent transition is enabled.

To see this, note that the current consensus depends on whether ctr(C) < c. If that is the case, but ϕ(C0)=0, then sum(C) ≥ c and some agent with positive contribution α > 0 exists. Due to ctr(C) < c, transition α is enabled. Conversely, if ctr(C) ≥ c and ϕ(C0)=1, some transition α with α < 0 will be enabled.

Finally, note that each non-silent transition increases the number of agents with contribution 0 by one, so at most n can be executed in total. So the execution converges and reaches, by the above argument, a correct consensus.

Convergence time. Each agent executes at most one non-silent transition. To estimate the total number of steps, we partition the agents by their current contribution: for a configuration C, let C<sup>+</sup> def= C restricted to {(q, v) ∈ Q : q > 0} denote the agents with positive contribution, and define C<sup>−</sup> analogously. We have that either ctr(C) < 0 and all transitions of agents in C<sup>+</sup> would be enabled, or ctr(C) ≥ 0 and the transitions of C<sup>−</sup> could be executed.

If C<sup>+</sup> is enabled, then we have to wait at most n/|C<sup>+</sup>| steps in expectation until a transition is executed, which reduces |C<sup>+</sup>| by one. In total we get n/|C<sub>0</sub><sup>+</sup>| + n/(|C<sub>0</sub><sup>+</sup>| − 1) + ... + n/1 ∈ O(n log n). The same holds for C<sup>−</sup>, yielding our overall bound of O(n log n).

#### 5.2 Modulo Predicates and Boolean Combinations

Proposition 2. *Let* ϕ(x<sub>1</sub>, ..., x<sub>k</sub>) def⇔ Σ<sub>i=1</sub><sup>k</sup> α<sub>i</sub>x<sub>i</sub> ≡ c (mod l) *denote a linear congruence, with* α<sub>1</sub>, ..., α<sub>k</sub>, c, l ∈ Z, l ≥ 2*. There exists a broadcast consensus protocol that computes* ϕ *within* O(n log n) *interactions in expectation.*

*Proof (sketch).* The idea is the same as for Proposition 1, but instead of taking care not to overflow the counter we simply perform the additions modulo l.

Proposition 3 (Boolean combination of predicates). *Let* ϕ *be a Boolean combination of predicates* ϕ1, ..., ϕk*, which are computed by BCPs* P1, ...,P<sup>k</sup>*, respectively, within* O(n log n) *interactions. Then there is a protocol computing* ϕ *within* O(n log n) *interactions.*

*Proof (sketch).* We do a simple parallel composition of the k BCPs, which is the same construction as used for ordinary population protocols (see for example [5, Lemma 6]). A detailed proof can be found in the full version of this paper.

### 6 Protocols for all Predicates in ZPL

BCPs compute precisely the predicates in NL with input encoded in unary, which corresponds to NSPACE(n) when encoded in binary. The proof of the NL lower bound by Blondin, Esparza and Jaax [14] goes through multiple stages of reduction and thus does not reveal which predicates can be computed *efficiently*. We will now take a more direct approach, using a construction similar to the one by Angluin, Aspnes and Eisenstat [7]. A step of a randomised Turing machine (RTM) can be simulated using variants of the protocols for Presburger predicates from Section 5, which we combine with a clock to determine whether the step has finished, with high probability.

Instead of simulating RTMs directly, it is more convenient to first reduce them to counter machines. Here, we will use counter machines that are both randomised and capable of multiplying and dividing by two, with the latter also determining the remainder. This ensures that the reduction is performed efficiently, i.e. with overhead of O(n log n) interactions per step.

We first show the other direction: simulating BCPs with RTMs.

Lemma 1. *Polynomial-time BCPs compute at most the predicates in* ZPL *with input encoded in unary.*

*Proof.* An RTM can store the number of agents in each state as binary counters. Picking an agent uniformly at random can be done in O(log n) time by picking a random number between 1 and n and comparing it to the agents in the different states. Simulating a transition can also be done with logarithmic overhead. It can further be shown that stabilization of the execution is decidable in time O(log n) (see the full version of this paper for details). As the BCP uses only O(poly n) interactions (in expectation) the RTM is also O(poly n) time-bounded.
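For the sampling step in this proof, the following sketch (Python, at the level of whole numbers rather than the bit-by-bit RTM operations) shows the idea of drawing an agent uniformly at random from the per-state counts:

```python
import random

def sample_state(counts):
    """counts: dict mapping each state to the number of agents in it.
    Returns a state chosen as if one of the n agents were picked uniformly."""
    n = sum(counts.values())
    r = random.randrange(n)       # a random number in {0, ..., n-1}
    for q, c in counts.items():   # compare r against the cumulative counts
        if r < c:
            return q
        r -= c
```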

Theorem 2. *Polynomial-time BCPs compute exactly the predicates in* ZPL *with input encoded in unary.*

The proof of Theorem 2 will take up the remainder of this section.

Counter machines. Let Cmd def= {mul₂, inc, divmod₂, iszero} denote a set of commands, and Ret def= {done₀, done₁} a set of completion statuses. A *multiplicative counter machine with* k *counters (*k*-CM)* A = (S, T₁, T₂) consists of a finite set of states S with init, 0, 1 ∈ S and two transition functions T₁, T₂ mapping a state q ∈ S to a tuple (i, j, q₀, q₁), where i ∈ {1, ..., k} refers to a counter, j ∈ Cmd is a command, and q₀, q₁ ∈ S are successor states (q₁ is not used for mul₂ and inc operations). Additionally, we require that T₁, T₂ map q ∈ {0, 1} to (1, iszero, q, q), effectively executing no operation from those states.

The idea is that A, starting in state init, picks transitions uniformly at random from either T<sup>1</sup> or T2. Apart from this randomness, the transitions are deterministic. Eventually, A ends up in either state 0 or 1, at which point it cannot perform further actions, thereby indicating whether the input is accepted or rejected.

Step-execution function. A *CM-configuration* is a tuple K = (q, x₁, ..., x_k) ∈ Q × ℕᵏ. We define the *step-execution function* step as follows, with x ∈ ℕ:

$$-\,\,\mathrm{step}(\mathsf{mul}\_2, x)\stackrel{\mathrm{def}}{=}(\mathsf{done}\_0, 2x),$$

$$-\,\,\mathrm{step}(\mathsf{inc}, x)\stackrel{\mathrm{def}}{=}(\mathsf{done}\_0, x + 1),$$

$$-\,\,\mathrm{step}(\mathsf{divmod}\_2, x)\stackrel{\mathrm{def}}{=}(\mathsf{done}\_{x \bmod 2}, \lfloor x/2 \rfloor),$$

$$-\,\,\mathrm{step}(\mathsf{iszero}, x)\stackrel{\mathrm{def}}{=}(\mathsf{done}\_{\min(x,1)}, x).$$
For two CM-configurations $K = (q, x_1, ..., x_k)$ and $K' = (q', x'_1, ..., x'_k)$ where $T_\circ(q) = (i, j, q_0, q_1)$ for $\circ \in \{1, 2\}$, we write $K \xrightarrow{\circ} K'$ if $\mathrm{step}(j, x_i) = (\mathsf{done}_b, x'_i)$, $q' = q_b$ for some $b \in \{0, 1\}$, and $x_r = x'_r$ for $r \neq i$. Note that for each K and ◦ there is exactly one K′ with $K \xrightarrow{\circ} K'$.
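A direct rendering of the step-execution function and of one random CM step (Python; the tuple-based machine encoding is illustrative and not fixed by the text):

```python
import random

def step(cmd, x):
    """The step-execution function: returns (completion status, new counter value)."""
    if cmd == "mul2":    return ("done0", 2 * x)
    if cmd == "inc":     return ("done0", x + 1)
    if cmd == "divmod2": return ("done1" if x % 2 else "done0", x // 2)
    if cmd == "iszero":  return ("done1" if x > 0 else "done0", x)
    raise ValueError(cmd)

def cm_step(T1, T2, config):
    """One step of the Markov chain on CM-configurations (q, x1, ..., xk);
    T1, T2 map a state q to (i, cmd, q0, q1)."""
    q, counters = config[0], list(config[1:])
    i, cmd, q0, q1 = random.choice((T1, T2))[q]   # pick T1 or T2 uniformly at random
    status, counters[i - 1] = step(cmd, counters[i - 1])
    return (q1 if status == "done1" else q0, *counters)
```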

The reasoning for introducing the step-execution function is that we want to construct a broadcast protocol (BP) which simulates just one step of the CM. Later on we can use this BP as a building block in a more general protocol.

Computation. Let ϕ : ℕˡ → {0, 1} denote a predicate, for l ≤ k, and C ∈ ℕˡ an input to ϕ. We sample a *random (CM-)execution* π = K₀K₁K₂... *for input* C, where K₀, K₁, ... are CM-configurations, via a Markov chain. For the initial configuration we have K₀ def= (init, C(1), ..., C(l), 0, ..., 0), and Kᵢ is determined as the unique configuration with $K_{i-1} \xrightarrow{\circ} K_i$, where ◦ ∈ {1, 2} is chosen uniformly at random. (So π is the random variable defined as the trace of the Markov chain.)

We say that A *computes* ϕ *within* f(n) *steps* if for each C ∈ ℕˡ with |C| = n the random execution for input C reaches a configuration in {ϕ(C)} × ℕᵏ after at most f(n) steps in expectation. Finally, A is n*-bounded* if the random executions for inputs C with |C| = n can only reach configurations in $Q \times \mathbb{N}_{\le n}^k$.

Theorem 3. *Let* ϕ *be a predicate decidable by a log-space bounded RTM within* O(f(n)) *steps in expectation with unary input encoding. There exists an* n*-bounded CM that accepts* ϕ *within* O(f(n) log(n)) *steps in expectation.*

*Proof (sketch).* This can be shown by first representing the Turing machine by a stack machine with two stacks that contain the tape content to the left/right of the current machine head position. In this representation, head movements and tape updates amount to performing pop/push operations on the stacks. Moreover, we can simulate a c · n-bounded stack by c many n-bounded stacks. An n-bounded stack, in turn, can be represented in a counter machine with a constant number of 2ⁿ-bounded counters. The stack content is represented as the base-2 number corresponding to the binary sequence stored in the stack. Popping then amounts to a divmod₂ operation, and pushing amounts to doubling the counter value, followed by adding 1 or 0, respectively.

A detailed proof can be found in the full version of this paper.
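The stack-to-counter encoding in the sketch can be spelled out as follows (a small Python illustration; the leading sentinel bit is an assumption added here so that leading zeros on the stack are preserved, and may differ from the paper's exact encoding):

```python
class CounterStack:
    """A stack of bits stored in a single counter: the content b1 ... bm (top last)
    is encoded as the number with binary expansion 1 b1 ... bm."""
    def __init__(self):
        self.counter = 1                     # empty stack (just the sentinel)

    def push(self, bit):                     # mul2, then inc if the pushed bit is 1
        self.counter = 2 * self.counter + bit

    def pop(self):                           # divmod2: quotient is the new content, remainder the popped bit
        self.counter, bit = divmod(self.counter, 2)
        return bit

    def empty(self):
        return self.counter == 1
```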

We formally define two types of BPs, ones that simulate a step of the CM, and ones behaving like a clock.

Definition 1. *Let BP* P = (Q × G, δ) *denote a BP with global states* G *where* 0, 1, ⊥ ∈ Q *and* Cmd, Ret ⊆ G*. We define the injection* $\varphi : G \times \mathbb{N}_{\le n} \to \mathbb{N}^{Q \times G}$ *as* $\varphi(j, x) \stackrel{\text{def}}{=} x \cdot \langle(1, j)\rangle + (n - x) \cdot \langle(0, j)\rangle$*. The configurations in* ϕ(Cmd × ℕ) *are called* initial*, the ones in* ϕ(Ret × ℕ) final*. We call a configuration* C failing*, if* C(⊥, i) > 0 *for some* i ∈ G*.*

*We say that* P *is* CM-simulating *if the sets of final and failing configurations are closed under reachability, and from every initial configuration* ϕ(j, w) *the only reachable final configuration is* ϕ(step(j, w))*, if both are well-defined.*

Definition 2. *Let* P = (Q, δ) *denote a BP with* 0, 1 ∈ Q *and* Time(P) *the number of steps until* P*, starting in configuration* ⟨0, ..., 0⟩*, reaches* ⟨1, ..., 1⟩*, or* ∞ *if it does not. If* Time(P) *is almost surely finite and no agent is in state* 1 *before* Time(P)*, then we call* P *a* clock-BP*.*

Now we begin by constructing a CM-simulating BP. The value of a given counter is scattered across the population: each agent stores its contribution to this counter value in its state. The counter value is the sum of all contributions. Usually, an agent's contribution is either 1 or 0, thus n agents can maximally store a counter value equal to n, which is not problematic, since the counter machine is assumed to be n-bounded. The difficult part is multiplying and dividing the counter by two. Besides contributions 0 and 1, we will also allow intermediate contributions ½ and 2. By executing a single broadcast, we can multiply (or divide) all the individual contributions by 2, by setting all contributions of value 1 to 2, or ½, respectively. Then, over time, we "normalise" the agents to all have contribution 0 or 1 again in a manner which is specified below. This process takes some time, and we cannot determine with perfect reliability whether it is finished, so we only bound the time with high probability. Here and in the following, we say that some event (dependent on the population size n) happens *with high probability* if for *all* k > 0 the event happens with probability $1 - O(n^{-k})$.

In this and subsequent lemmata we use G(p), for 0 <p< 1, to denote the geometric distribution, that is the number of *trials* until a coin flip with probability p succeeds, which has expectation 1/p. We start with a statement about the tail distributions of sums of geometric variables.

Lemma 2. *Let* n ≥ 3 *and* X1, ..., X<sup>n</sup> *denote independent random variables with sum* X *and* X<sup>i</sup> ∼ G(i/n)*. Then for any* k ≥ 1 *there is an* l *s.t.*

$$\mathbb{P}(X \ge l \cdot n \ln n) \le n^{-k}$$

*Proof.* See the full version of this paper.

Lemma 3. *There is a CM-simulating BP s.t. starting from an initial configuration it reaches a final configuration within* O(n log n) *steps with high probability.*

*Proof.* Let P = (Q × G, δ) denote our BP, with Q def= {0, ½, 1, 2, ∗} and G def= Cmd ∪ Ret ∪ {high}. The following transitions initialise the computation, with b ∈ {0, 1}:

$$(b, \mathsf{mul}\_2) \mapsto (2b, \mathsf{done}\_0), \{1 \mapsto 2, 0 \mapsto 0\} \tag{\alpha\_1}$$

$$(b, \mathsf{div}\mathsf{mod}\_2) \mapsto (\frac{b}{2}, \mathsf{d}\mathsf{on}\mathsf{e}\_0), \left\{1 \mapsto \frac{1}{2}, 0 \mapsto 0\right\} \tag{\alpha\_2}$$

$$(b, \mathsf{inc}) \mapsto (b, \mathsf{high}), \; \emptyset \tag{\alpha\_3}$$

Additionally, we need transitions that move agents back into states 0 and 1.

$$(0, \mathsf{high}) \mapsto (1, \mathsf{done}\_0), \emptyset \tag{\beta\_1}$$

$$(2, \mathsf{done}\_0) \mapsto (1, \mathsf{high}), \emptyset \tag{\beta\_2}$$

$$(\frac{1}{2}, \mathsf{done}\_{0}) \mapsto (0, \mathsf{done}\_{1}), \emptyset \tag{\beta\_{3}}$$

$$(\frac{1}{2}, \mathsf{done}\_1) \mapsto (1, \mathsf{done}\_0), \emptyset \tag{\beta\_4}$$

This requires some explanation. Basically, we have the invariant that for a configuration C the current value of the counter is $b + \sum_{i \in Q, j \in G} i \cdot C((i, j))$, where b is 1 if the global state is high and 0 else. There is a "canonical" representation of each counter value, where b = 0 and the individual contributions i ∈ Q are only 0 and 1. The transitions (α₁–α₃) update the represented counter value in a single step, but cause a "noncanonical" representation. The transitions (β₁–β₄) preserve the value of the counter and cause the representation to eventually become canonical.

This corresponds to final configurations from Definition 1: as long as the representation is noncanonical, i.e. an agent with value ½, 2 or ∗ exists, the configuration is not final. Conversely, once we reach a final configuration our representation is canonical, and, as the value of the counter is preserved, we reach the correct final configuration.

$$(1, \mathsf{iszero}) \mapsto (1, \mathsf{done}\_1), \emptyset \tag{\alpha\_4}$$

$$(0, \mathsf{iszero}) \mapsto (0, \mathsf{done}\_0), \{1 \mapsto \ast\} \tag{\alpha\_5}$$

$$(\ast, \mathsf{done}\_0) \mapsto (1, \mathsf{done}\_1), \{\ast \mapsto 1\} \tag{\beta\_5}$$

For iszero we do something similar, but the value of the counter does not change. If the initial transition is executed by an agent with value 1, we can go to the global state done<sup>1</sup> directly. Otherwise, we replace 1 by ∗ and go to done0, so if no agents with value 1 exist, we are finished. Else some agent with value ∗ executes (β5) and we move to the correct final configuration.
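As an illustration of the normalisation phase, the following sketch (Python; uniform one-agent-per-step scheduling assumed) simulates the (β₃)/(β₄) phase that follows (α₂): starting with x agents holding ½ and global state done₀, it runs until the representation is canonical, which should yield the final configuration representing step(divmod₂, x) = (done_{x mod 2}, ⌊x/2⌋).

```python
import random

def normalise_divmod2(x, n):
    """x agents hold 1/2 after (α2), the remaining n - x hold 0; global state is done0.
    Returns (steps taken, number of agents with value 1, final global state)."""
    agents = [0.5] * x + [0] * (n - x)
    glob = "done0"
    steps = 0
    while any(a == 0.5 for a in agents):
        i = random.randrange(n)
        steps += 1
        if agents[i] == 0.5:
            if glob == "done0":
                agents[i], glob = 0, "done1"     # (β3)
            else:
                agents[i], glob = 1, "done0"     # (β4)
    return steps, agents.count(1), glob

# e.g. normalise_divmod2(7, 100) ends with 3 agents at value 1 and global state done1,
# i.e. the final configuration represents step(divmod2, 7) = (done1, 3).
```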

Final configurations can only contain states {0, 1} × Ret. As we have no outgoing transitions from those states, they are indeed closed under reachability.

It remains to be shown that starting from a configuration C₀ we reach a final configuration within O(n log n) steps with high probability. Note that each of the transitions (α₁–α₅) is executed at most once. Moreover, these are the only transitions enabled at C₀, so let C₁ denote the successor configuration after executing one of (α₁–α₅), i.e. C₀ → C₁. From now on, we consider only transitions (β₁–β₅).

Let $M \stackrel{\text{def}}{=} \{\frac{1}{2}, 2, \ast\} \times G$ denote the set of "noncanonical" states, and, for a configuration C, let $\Phi(C) \stackrel{\text{def}}{=} 2\sum_{q \in M} C(q) + b$ denote a potential function, with b being 1 if the global state of C is high and 0 else. Now we can observe that executing a (β₁–β₅) transition strictly decreases Φ, and that 0 ≤ Φ(C) ≤ 2n for any configuration C. So after at most 2n non-silent transitions, we have reached a final configuration.

Fix some transition (β_j), let q ∈ Q × G denote the state initiating (β_j), and let C, C′, C″ denote configurations with $C \xrightarrow{\beta_j} C' \xrightarrow{*} C''$, meaning that C″ is a configuration reachable from C after executing (β_j). Then, we claim, C(q) > C″(q).

To see that this holds for transitions (β₂–β₅), note that for i ∈ {½, 2, ∗} the number of agents with value i can only decrease when executing transitions (β₁–β₅). For (β₁) this is slightly more complicated, as (β₃) increases the number of agents with value 0. However, (β₁) is reachable only after (α₁) or (α₃) has been executed, while (β₃) requires (α₂). Thus, our claim follows.

Fig. 3. State diagram of the clock implementation. Nodes with i agents in state c₃ are labelled i or i⁺, the latter denoting that the other agents are in states c₁⁺ and c₂⁺. The final state ∗ has all agents in state 1. Arcs are labelled with transition probabilities.

Let X_k denote the number of silent transitions before executing (β_j) for the k-th time, k = 1, ..., l, and let r_k denote the number of agents in state q at that time. Then n ≥ r₁ > r₂ > ... > r_l ≥ 1 and X_k is distributed according to G(r_k/n). So we can use Lemma 2 to show that the sum of the X_k is O(n log n) with high probability. There are only 5 transitions (β_j), so the same holds for the total number of steps until reaching a final state.

Our next construction is the clock-BP, which indicates that some amount of time has passed (with high probability). Angluin, Aspnes and Eisenstat used epidemics for this purpose [7], as do we. The idea is that one agent initiates an epidemic and waits until it sees an infected agent. Similar to standard analysis of the coupon collector's problem, this is likely to take Θ(n log n) time.

Lemma 4. *There is a clock-BP* P = (Q, δ) *s.t.* E(Time(P)) ∈ O(n log n) *and* Time(P) ∈ Ω(n log n) *with probability* $1 - O(n^{-1/2})$*.*

*Proof (sketch).* For a clock we use states {0, 1, c₁, c₂, c₃, c₁⁺, c₂⁺} and transitions

$$0 \mapsto c\_1^+, \{0 \mapsto c\_2^+\} \tag{\alpha}$$

$$c\_2^+ \mapsto c\_3, \{c\_2^+ \mapsto c\_2, c\_1^+ \mapsto c\_1\} \tag{\beta}$$

$$c\_3 \mapsto c\_3, \{c\_2 \mapsto c\_2^+, c\_1 \mapsto c\_1^+\} \tag{\gamma}$$

$$c\_1^+ \mapsto 1, \{c\_2^+ \mapsto 1, c\_3 \mapsto 1\} \tag{\omega}$$

State 0 is the initial state, 1 the final state. States c₁ and c₂ denote "uninfected" agents, state c₃ "infected" ones. The former can become activated (moving to c₁⁺ and c₂⁺), causing one of them to become infected. Transition (α) marks a leader c₁; once they are infected, the clock ends (via (ω)). In (β), a single activated agent becomes infected, deactivating the other agents. They get activated again via transition (γ). The state diagram is shown in Figure 3.
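The behaviour of this clock can be simulated directly on the state diagram of Figure 3. The following sketch (Python) models the induced Markov chain, assuming the uniform one-agent-per-step scheduler; the agent with index 0 plays the leader.

```python
import random

def clock_time(n):
    """Number of steps until the clock of Lemma 4 ends, simulated on the chain of Fig. 3:
    the state is (number of agents in c3, whether the other agents are activated)."""
    steps = 1                       # the initiating transition (α)
    infected, activated = 0, True   # after (α): leader in c1+, everyone else in c2+
    while True:
        steps += 1
        r = random.randrange(n)
        if activated:
            if r == 0:                        # the leader c1+ is selected: (ω) ends the clock
                return steps
            if r <= n - 1 - infected:         # an agent in c2+ is selected: (β) infects it
                infected += 1
                activated = False
            # otherwise an agent in c3 is selected: nothing changes
        else:
            if r < infected:                  # an agent in c3 is selected: (γ) reactivates everyone
                activated = True
            # otherwise a deactivated agent is selected: silent step
```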

It remains to show that this protocol fulfils the stated time bounds. We prove E(Time(P)) ∈ O(n log n) by using that, in expectation, the protocol spends at most n/j steps in state j and at most n/(n−j) in state j⁺. For the lower bound we make a case distinction: either state $\sqrt{n}$ is not visited (i.e. the leader is one of the first $\sqrt{n}$ agents to be infected), or the total number of steps is at least $X_1 + ... + X_{\sqrt{n}}$, where $X_j$ is the number of steps the protocol spends in state j. As $X_j$ is geometrically distributed with mean n/j, we apply a tail bound from Janson [23] to get the desired result.

A detailed proof can be found in the full version of the paper.

While the above clock measures some interval of time with some reliability, we want a clock that measures an "arbitrarily long" interval with "arbitrarily high" reliability. Constructions for population protocols use phase clocks for this purpose, but broadcasts allow us to synchronise the agents, so we can directly execute the clock multiple times in sequence instead.

Lemma 5. *Let* k ∈ ℕ *denote some constant. Then there is a clock-BP* P *s.t.* E(Time(P)) ∈ O(n log n)*, and* Time(P) < kn log n *with probability* $O(n^{-k})$*.*

*Proof (sketch).* The idea is that we run 28k<sup>2</sup> clocks in sequence, in groups of 2k. Then it is likely that at least one clock in each group works, yielding the overall minimum running time. A detailed proof can be found in the full version of this paper.

As mentioned earlier, we combine the clock with the construction in Lemma 3. While we cannot reliably determine whether the operation has finished, we can use a clock to measure an interval of time long enough for the protocol to terminate with high probability. The next construction does just that. In particular, in contrast to Lemma 3, it uses its global state to indicate that it is done.

Lemma 6. *There is a CM-simulating BP s.t. starting from an initial configuration it reaches either a final or a failing configuration* C *almost surely and within* O(n log n) *steps in expectation, and* C *is final with high probability. Additionally, all reachable configurations with global state in* Ret *are final or failing.*

*Proof.* Fix some k ∈ ℕ and let P = (Q × G, δ) denote the BP we want to construct. Further, let P₁ = (Q₁ × G₁, δ₁) denote the BP from Lemma 3 and choose some c s.t. P₁ reaches a final configuration after at most cn log n steps with probability at least $1 - n^{-k}$.

Now we use Lemma 5 to get a clock P₂ = (Q₂, δ₂) that runs for at least cn log n steps with probability at least $1 - n^{-k}$.

We do a parallel composition of P₁ and P₂ to get P. In particular, Q def= Q₁ × Q₂ and G def= {j◦ : j ∈ G₁} ∪ Ret, where for Q we identify (i, 0) with i for i ∈ {0, 1, ⊥}, and for G we identify j with j◦ for j ∈ Cmd.

Intuitively, we use ◦ to rename the global states of P₁, meaning that the global state j ∈ G₁ of P₁ is now called j◦ in our protocol. We want P₁ to start with the same initial state we have, which is why we identified j with j◦ for j ∈ Cmd. However, we only want to enter a final configuration once the clock has run out, so the completion statuses of P₁ are renamed into j◦ for j ∈ Ret and we enter a final configuration by setting the global state to a j ∈ Ret.

For each (q₁, j) ∈ Q₁ × G₁ and q₂ ∈ Q₂ with δ₁(q₁, j) = ((r₁, j′), f₁) and δ₂(q₂) = (r₂, f₂) we get the transition

$$(q\_1, q\_2, j\_\circ) \mapsto (r\_1, r\_2, j'\_\circ), \ \{(t\_1, t\_2) \mapsto (f\_1(t\_1), f\_2(t\_2)) : t\_1 \in Q\_1, t\_2 \in Q\_2\} \qquad (\alpha)$$

These transitions, together with the way we identified states, ensure that P<sup>1</sup> and P<sup>2</sup> run normally, with the input being passed through to P<sup>1</sup> transparently. However, note that the final configurations of P<sup>1</sup> are not final for P, meaning that the protocol never ends. Hence, for q<sup>1</sup> ∈ Q1, j ∈ Ret we add the transition

$$\begin{aligned} (q\_1, 1, j\_\circ) \mapsto (q\_1, 0, j), \{ (b, 1) \mapsto (b, 0) : b \in \{0, 1\} \} \\ \cup \{ (i, 1) \mapsto (\bot, 0) : i \in Q\_1 \setminus \{0, 1\} \} \end{aligned} \tag{\beta}$$

This terminates the protocol once the clock has run out. If P<sup>1</sup> was in a final state, we will now enter a final state as well, else we move into a failing state.

Finally, we use the above BP to simulate the full l-CM.

Lemma 7. *Fix some predicate* <sup>ϕ</sup> : <sup>N</sup><sup>k</sup> → {0, <sup>1</sup>} *computable by an* <sup>n</sup>*-bounded* l*-CM within* O(f(n)) ⊆ O(poly n) *steps. Then there is a BCP computing* ϕ *in* O(f(n) n log n) *steps.*

*Proof (sketch).* For each counter we need n agents, so ln in total, but we can simply have each agent simulate a constant number of agents. To execute a step of the CM, we use the BP from Lemma 6. It succeeds only with high probability, but in the case of failure at least one agent will have local state ⊥, from which that agent initiates a restart of the whole computation.

As the CM takes only a polynomial number of steps, we can fix a k s.t. a computation of our BCP without failures (i.e. one that succeeds on the first try) takes $O(n^k)$ steps. A single step succeeds with high probability, so we can require it to fail with probability at most $O(n^{-k-1})$. In total, the restarts increase the running time by a factor of $1/(1 - O(n^{-1}))$, which is only a constant overhead.
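Spelled out, the constant-overhead claim is a short calculation (a union bound over the $O(n^k)$ steps of a failure-free run, followed by the expected number of geometrically distributed attempts):

$$\Pr[\text{a given attempt fails}] \;\le\; O(n^k)\cdot O(n^{-k-1}) \;=\; O(n^{-1}), \qquad \mathbb{E}[\#\text{attempts}] \;\le\; \frac{1}{1-O(n^{-1})} \;=\; 1+O(n^{-1}).$$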

A detailed proof can be found in the full version of this paper.

This completes the proof of Theorem 2. By Theorem 3, each predicate in ZPL (with input encoded in unary) is computable by a bounded l-CM. Lemma 7 then yields a polynomial-time BCP for that predicate.

We remark that our reductions also enable us to construct efficient BCPs for specific predicates. The predicate PowerOfTwo, for example, as described in [14, Proposition 3], can trivially be decided by an O(log n)-time bounded RTM with input encoded in binary, so there is also a BCP computing that predicate within $O(n \log^2 n)$ interactions.

#### References

1. Alistarh, D., Aspnes, J., Eisenstat, D., Gelashvili, R., Rivest, R.L.: Time-space trade-offs in population protocols. In: Proceedings of the twenty-eighth annual ACM-SIAM symposium on discrete algorithms. pp. 2560–2579. SIAM (2017)



### **Leafy automata for higher-order concurrency**

Alex Dixon<sup>1</sup>, Ranko Lazić<sup>2</sup>, Andrzej S. Murawski<sup>3</sup>, and Igor Walukiewicz<sup>4</sup>

<sup>1</sup> University of Warwick, Coventry, UK, alexander.dixon@warwick.ac.uk

<sup>2</sup> University of Warwick, Coventry, UK

<sup>3</sup> University of Oxford, Oxford, UK

<sup>4</sup> CNRS, Université de Bordeaux, Talence, France

**Abstract.** Finitary Idealized Concurrent Algol (FICA) is a prototypical programming language combining functional, imperative, and concurrent computation. There exists a fully abstract game model of FICA, which in principle can be used to prove equivalence and safety of FICA programs. Unfortunately, the problems are undecidable for the whole language, and only very rudimentary decidable sub-languages are known.

We propose leafy automata as a dedicated automata-theoretic formalism for representing the game semantics of FICA. The automata use an infinite alphabet with a tree structure. We show that the game semantics of any FICA term can be represented by traces of a leafy automaton. Conversely, the traces of any leafy automaton can be represented by a FICA term. Because of the close match with FICA, we view leafy automata as a promising starting point for finding decidable subclasses of the language and, more generally, to provide a new perspective on models of higher-order concurrent computation.

Moreover, we identify a fragment of FICA that is amenable to verification by translation into a particular class of leafy automata. Using a locality property of the latter class, where communication between levels is restricted and every other level is bounded, we show that their emptiness problem is decidable by reduction to Petri net reachability.

**Keywords:** Finitary Idealized Concurrent Algol, Higher-Order Concurrency, Automata over Infinite Alphabets, Game Semantics

#### **1 Introduction**

Game semantics is a versatile paradigm for giving semantics to a wide spectrum of programming languages [3,35]. It is well-suited for studying the observational equivalence of programs and, more generally, the behaviour of a program in an arbitrary context. About 20 years ago, it was discovered that the game semantics of a program can sometimes be expressed by a finite automaton or another simple computational model [20]. This led to algorithmic uses of game semantics for program analysis and verification [1,15,21,5,27,26,28,34,16,17]. Thus far, these advances concerned mostly languages without concurrency.


In this work, we consider Finitary Idealized Concurrent Algol (FICA) and its fully abstract game semantics [22]. It is a call-by-name language with higher-order features, side-effects, and concurrency implemented by a parallel composition operator and semaphores. It is finitary since, as is common in this context, base types are restricted to finite domains. Quite surprisingly, the game semantics of this language is arguably simpler than that for the language without concurrency. The challenge comes from algorithmic considerations.

Following the successful approach from the sequential case [20,37,33,36,11], the first step is to find an automaton model abstracting the phenomena appearing in the semantics. The second step is to obtain program fragments from structural restrictions on the automaton model. In this paper we take both steps.

We propose leafy automata: an automaton model working on nested data. Data are used to represent pointers in plays, while the nesting of data reflects structural dependencies in the use of pointers. Interestingly, the structural dependencies in plays boil down to imposing a tree structure on the data. We show a close correspondence between the automaton model and the game semantics of FICA. For every program, there is a leafy automaton whose traces (data words) represent precisely the plays in the semantics of the program (Theorem 3). Conversely, for every leafy automaton, there is a program whose semantics consists of plays representing the traces of the automaton (Theorem 5). (The latter result holds modulo a saturation condition we explain later.) This equivalence shows that leafy automata are a suitable model for studying decidability questions for FICA.

Not surprisingly, due to their close connection to FICA, leafy automata turn out to have an undecidable emptiness problem. We use the undecidability argument to identify the source, namely communication across several unbounded levels, i.e., levels in which nodes can produce an unbounded number of children during the lifetime of the automaton. To eliminate the problem, we introduce a restricted variant of leafy automata, called local, in which every other level is bounded and communication is allowed to cross only one unbounded node. Emptiness for such automata can be decided via reduction to a number of instances of Petri net reachability problem.

We also identify a fragment of FICA, dubbed local FICA (LFICA), which maps onto local leafy automata. It is based on restricting the distance between semaphore and variable declarations and their uses inside the term. This is a first non-rudimentary fragment of FICA for which some verification tasks are decidable. Overall, this makes it possible to use local leafy automata to analyse LFICA terms and decide associated verification tasks.

Related work Concurrency, even with only first-order recursion, leads to undecidability [39]. Intuitively, one can encode the intersection of languages of two pushdown automata. From the automata side, much research on decidable cases has concentrated on bounding interactions between stacks representing different threads of the program [38,30,4]. From the game semantics side, the only known decidable fragment of FICA is Syntactic Control of Concurrency (SCC) [23], which imposes bounds on the number of threads in which arguments can be used. This restriction makes it possible to represent the game semantics of programs by finite automata. In our work, we propose automata models that correspond to unbounded interactions with arbitrary FICA contexts, and importantly that remains true also when we restrict the terms to LFICA. Leafy automata are a model of computation over an infinite alphabet. This area has been explored extensively, partly motivated by applications to database theory, notably XML [41]. In this context, nested data first appeared in [7], where the authors considered shuffle expressions as the defining formalism. Later on, data automata [9] and class memory automata [8] have been adapted to nested data in [14,12]. They are similar to leafy automata in that the automaton is allowed to access states related to previous uses of data values at various depths. What distinguishes leafy automata is that the lifetime of a data value is precisely defined and follows a question and answer discipline in correspondence with game semantics. Leafy automata also feature run-time "zero-tests", activated when reading answers.

For most models over nested data, the emptiness problem is undecidable. To achieve decidability, the authors in [14,12] relax the acceptance conditions so that the emptiness problem can eventually be recast as a coverability problem for a well-structured transition system. In [10], this result was used to show decidability of equivalence for a first-order (sequential) fragment of Reduced ML. On the other hand, in [7] the authors relax the order of letters in words, which leads to an analysis based on semi-linear sets. Both of these restrictions are too strong to permit the semantics of FICA, because of the game-semantic WAIT condition, which corresponds to waiting until all sub-processes terminate.

Another orthogonal strand of work on concurrent higher-order programs is based on higher-order recursion schemes [24,29]. Unlike FICA, they feature recursion but the computation is purely functional over a single atomic type o.

Structure of the paper: In the next two sections we recall FICA and its game semantics from [22]. The following sections introduce leafy automata (LA) and their local variant (LLA), where we also analyse the associated decision problems and, in particular, show that the non-emptiness problem for LLA is decidable. Subsequently, we give a translation from FICA to LA (and back) and define a fragment LFICA of FICA which can be translated into LLA. We will occasionally refer the reader to the full paper [18] which includes appendices with proof details and worked examples.

#### **2 Finitary Idealized Concurrent Algol (FICA)**

Idealized Concurrent Algol [22] is a paradigmatic language combining higher-order with imperative computation in the style of Reynolds [40], extended to concurrency with parallel composition (||) and binary semaphores. We consider its finitary variant FICA over the finite datatype {0,..., max} (max ≥ 0) with loops but no recursion. Its types θ are generated by the grammar

$$\theta ::= \beta \mid \theta \to \theta \qquad\qquad \beta \coloneqq \mathbf{com} \mid \mathbf{exp} \mid \mathbf{var} \mid \mathbf{sem}$$


#### Fig. 1: FICA typing rules

where **com** is the type of commands; **exp** that of {0,..., max }-valued expressions; **var** that of assignable variables; and **sem** that of semaphores. The typing judgments are displayed in Figure 1. **skip** and **div**<sup>θ</sup> are constants representing termination and divergence respectively, i ranges over {0, ··· , max }, and **op** represents unary arithmetic operations, such as successor or predecessor (since we work over a finite datatype, operations of bigger arity can be defined using conditionals). Variables and semaphores can be declared locally via **newvar** and **newsem**. Variables are dereferenced using !M, and semaphores are manipulated using two (blocking) primitives, **grab**(s) and **release**(s), which grab and release the semaphore respectively. The small-step operational semantics of FICA is reproduced in the full paper [18, Appendix A]. We shall write **div** for **divcom**.

We are interested in contextual equivalence of terms. Two terms are contextually equivalent if there is no context that can distinguish them with respect to may-termination. More formally, a term M : **com** is said to terminate, written M ⇓, if there exists a terminating evaluation sequence from M to **skip**. Then contextual (may-)equivalence (Γ ⊢ M₁ ≅ M₂) is defined by: for all contexts C such that C[M] : **com**, C[M₁]⇓ if and only if C[M₂]⇓. The force of this notion is quantification over all contexts.

Since contextual equivalence becomes undecidable for FICA very quickly [23], we will look at the special case of testing equivalence with terms that always diverge, e.g. given Γ ⊢ M : θ, is it the case that Γ ⊢ M ≅ **div**<sub>θ</sub>? Intuitively, equivalence with an always-divergent term means that C[M] will never converge (must diverge) if C uses M. At the level of automata, this will turn out to correspond to the emptiness problem.

In verification tasks, with the above equivalence test, we can check whether uses of M can ever lead to undesirable states. For example, for a given term x : **var** ⊢ M : θ, the term

$$f: \theta \to \mathbf{com} \vdash \mathbf{newvar} \, x := 0 \, \text{in} \, (f(M) \, || \, \text{if} \, !x = 13 \, \text{then} \, \text{skip} \, \text{else} \, \text{div})$$

will be equivalent to **div** only when x is never set to 13 during a terminating execution. Note that, because of quantification over all contexts, f may use M an arbitrary number of times, also concurrently or in nested fashion, which is a very expressive form of quantification.

### **3 Game semantics**

Game semantics for programming languages involves two players, called Opponent (O) and Proponent (P), and the sequences of moves made by them can be viewed as interactions between a program (P) and a surrounding context (O). In this section, we briefly present the fully abstract game model for FICA from [22], which we rely on in the paper. The games are defined using an auxiliary concept of an arena.

**Definition 1.** An arena A is a triple ⟨M<sub>A</sub>, λ<sub>A</sub>, ⊢<sub>A</sub>⟩ where:


We shall write I<sup>A</sup> for the set of all moves of A which have no enabler; such moves are called initial. Note that an initial move must be an Opponent question. In arenas used to interpret base types all questions are initial and P-moves answering them are detailed in the table below, where i ∈ {0, ··· , max }.


More complicated types are interpreted inductively using the product (A × B) and arrow (A ⇒ B) constructions, given below.

$$\begin{array}{lll} M\_{A \times B} = M\_A + M\_B & M\_{A \Rightarrow B} = M\_A + M\_B\\ \lambda\_{A \times B} = [\lambda\_A, \lambda\_B] & \lambda\_{A \Rightarrow B} = [\langle \lambda\_A^{PO}, \lambda\_A^{QA} \rangle, \lambda\_B] \\ \vdash\_{A \times B} = \vdash\_A + \vdash\_B & \vdash\_{A \Rightarrow B} = \vdash\_A + \vdash\_B + \{ (b, a) \mid b \in I\_B \text{ and } a \in I\_A \} \end{array}$$

where λ<sub>A</sub><sup>PO</sup>(m) = O iff λ<sub>A</sub><sup>OP</sup>(m) = P. We write ⟦θ⟧ for the arena corresponding to type θ. Below we draw (the enabling relations of) A₁ = ⟦**com** → **com** → **com**⟧ and A₂ = ⟦(**var** → **com**) → **com**⟧ respectively, using superscripts to distinguish copies of the same move (the use of superscripts is consistent with our future use of tags in Definition 9).

Given an arena A, we specify next what it means to be a legal play in A. For a start, the moves that players exchange will have to form a justified sequence, which is a finite sequence of moves of A equipped with pointers. Its first move is always initial and has no pointer, but each subsequent move n must have a unique pointer to an earlier occurrence of a move m such that m ⊢<sub>A</sub> n. We say that n is (explicitly) justified by m or, when n is an answer, that n answers m. If a question does not have an answer in a justified sequence, we say that it is pending in that sequence. Below we give two justified sequences from A₁ and A₂ respectively.

[Figure: two example justified sequences, one over A₁ and one over A₂, with justification pointers drawn as arcs.]

Not all justified sequences are valid. In order to constitute a legal play, a justified sequence must satisfy a well-formedness condition that reflects the "static" style of concurrency of our programming language: any started sub-processes must end before the parent process terminates. This is formalised as follows, where the letters q and a refer to question- and answer-moves respectively, while m denotes arbitrary moves.

**Definition 2.** The set P<sup>A</sup> of plays over A consists of the justified sequences s over A that satisfy the two conditions below.


It is easy to check that the justified sequences given above are plays. A subset σ of P<sub>A</sub> is O-complete if s ∈ σ and s·o ∈ P<sub>A</sub> imply s·o ∈ σ, when o is an O-move.

**Definition 3.** A strategy on A, written σ : A, is a prefix-closed O-complete subset of PA.

Suppose Γ = {x₁ : θ₁, ··· , x_l : θ_l} and Γ ⊢ M : θ is a FICA-term. Let us write ⟦Γ ⊢ θ⟧ for the arena ⟦θ₁⟧ × ··· × ⟦θ_l⟧ ⇒ ⟦θ⟧. In [22] it is shown how to assign a strategy on ⟦Γ ⊢ θ⟧ to any FICA-term Γ ⊢ M : θ. We write ⟦Γ ⊢ M⟧ to refer to that strategy. For example, ⟦Γ ⊢ **div**⟧ = {ε, run} and ⟦Γ ⊢ **skip**⟧ = {ε, run, run done}. Given a strategy σ, we denote by comp(σ) the set of nonempty complete plays of σ, i.e. those in which all questions have been answered. The game-semantic interpretation ⟦···⟧ turns out to provide a fully abstract model in the following sense.

**Theorem 1 ([22]).** Γ ⊢ M₁ ≅ M₂ iff comp(⟦Γ ⊢ M₁⟧) = comp(⟦Γ ⊢ M₂⟧).

In particular, since we have comp(⟦Γ ⊢ **div**<sub>θ</sub>⟧) = ∅, Γ ⊢ M : θ is equivalent to **div**<sub>θ</sub> iff comp(⟦Γ ⊢ M⟧) = ∅.

### **4 Leafy automata**

We would like to be able to represent the game semantics of FICA using automata. To that end, we introduce leafy automata (LA). They are a variant of automata over nested data, i.e. a type of automata that read finite sequences of letters of the form (t, d0d<sup>1</sup> ··· d<sup>j</sup> ) (j ∈ N), where t is a tag from a finite set Σ and each d<sup>i</sup> (0 ≤ i ≤ j) is a data value from an infinite set D.

In our case, D will have the structure of a countably infinite forest and the sequences d<sup>0</sup> ··· d<sup>j</sup> will correspond to branches of a tree. Thus, instead of d<sup>0</sup> ··· d<sup>j</sup> , we can simply write d<sup>j</sup> , because d<sup>j</sup> uniquely determines its ancestors: d0,...,d<sup>j</sup>−1. The following definition captures the technical assumptions on D.

**Definition 4.** D is a countably infinite set equipped with a function pred : D → D ∪ {⊥} (the parent function) such that the following conditions hold.


In order to define configurations of leafy automata, we will rely on finite subtrees of D, whose nodes will be labelled with states. We say that T ⊆ D is a subtree of D iff T is closed (∀x ∈ T : pred(x) ∈ T∪{⊥}) and rooted (∃!x ∈ T : pred(x) = ⊥).
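For illustration, the forest structure and the subtree condition can be rendered as a small sketch (Python; the node representation is an assumption, not fixed by Definition 4):

```python
class Node:
    """A data value of D: it knows its parent (pred) and hence its level."""
    def __init__(self, parent=None):
        self.parent = parent                           # pred(d); None plays the role of ⊥
        self.level = 0 if parent is None else parent.level + 1

def pred(d):
    return d.parent

def is_subtree(T):
    """T ⊆ D is a subtree iff it is closed under pred and has exactly one root."""
    T = set(T)
    closed = all(pred(d) is None or pred(d) in T for d in T)
    rooted = sum(1 for d in T if pred(d) is None) == 1
    return closed and rooted
```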

Next we give the formal definition of a level-k leafy automaton. Its set of states Q will be divided into layers, written Q(i) (0 ≤ i ≤ k), which will be used to label level-i nodes. We will write Q(i1,··· ,ik) to abbreviate Q(i1) ×···× Q(ik), excluding any components Q(i<sup>j</sup> ) where i<sup>j</sup> < 0. We distinguish Q(0,−1) = {†}.

**Definition 5.** A level-k leafy automaton (k-LA) is a tuple A = ⟨Σ, k, Q, δ⟩, where


$$\begin{array}{l} - \quad \delta\_{\mathsf{Q}} = \sum\_{i=0}^{k} \delta\_{\mathsf{Q}}^{(i)}, \text{ where } \delta\_{\mathsf{Q}}^{(i)} \subseteq Q^{(0,1,\cdots,i-1)} \times \Sigma\_{\mathsf{Q}} \times Q^{(0,1,\cdots,i)} \text{ for } 0 \le i \le k;\\ -\quad \delta\_{\mathsf{A}} = \sum\_{i=0}^{k} \delta\_{\mathsf{A}}^{(i)}, \text{ where } \delta\_{\mathsf{A}}^{(i)} \subseteq Q^{(0,1,\cdots,i)} \times \Sigma\_{\mathsf{A}} \times Q^{(0,1,\cdots,i-1)} \text{ for } 0 \le i \le k. \end{array}$$

Configurations of LA are of the form (D, E, f), where D is a finite subset of D (consisting of data values that have been encountered so far), E is a finite subtree of D, and f : E → Q is a level-preserving function, i.e. if d is a level-i data value then f(d) ∈ Q(i). A leafy automaton starts from the empty configuration κ<sup>0</sup> = (∅, ∅, ∅) and proceeds according to δ, making two kinds of transitions. Each kind manipulates a single leaf: for questions one new leaf is added, for answers one leaf is removed. Let the current configuration be κ = (D, E, f).

**–** On reading a letter (t, d) with t ∈ Σ<sup>Q</sup> and d ∈ D a fresh level-i data, the automaton adds a new leaf d in a configuration and updates the states on the branch to d. So it changes its configuration to κ = (D ∪ {d}, E ∪ {d}, f ) provided that pred(d) ∈ E and f satisfies:

$$(f(pred^i(d)), \dots, f(pred(d)), t, f'(pred^i(d)), \dots, f'(pred(d)), f'(d)) \in \delta\_{\mathbb{Q}}^{(i)},$$

dom(f′) = dom(f) ∪ {d}, and f′(x) = f(x) for all x ∉ {d, pred(d), ···, pred<sup>i</sup>(d)}.

**–** On reading a letter (t, d) with t ∈ Σ<sup>A</sup> and d ∈ E a level-i data which is a leaf, the automaton deletes d and updates the states on the branch to d. So it changes its configuration to κ = (D, E \ {d}, f ) where f satisfies:

$$(f(pred^i(d)), \cdot, \cdot, f(pred(d)), f(d), t, f'(pred^i(d)), \cdot, \cdot, f'(pred(d))) \in \delta\_{\mathsf{A}}^{(i)},$$

dom(f′) = dom(f) \ {d} and f′(x) = f(x) for all x ∉ {pred(d), ··· , pred<sup>i</sup>(d)}. **–** Initially D, E, and f are empty; we proceed to κ′ = ({d}, {d}, {d ↦ q⁽⁰⁾}) if (t, d) is read, where $\dagger \xrightarrow{t} q^{(0)} \in \delta_{\mathsf{Q}}^{(0)}$. The last move is treated symmetrically.

In all cases, we write $\kappa \xrightarrow{(t,d)} \kappa'$. Note that a single transition can only change states on the branch ending in d. Other parts of the tree remain unchanged.

Example 1. Below we illustrate the effect of LA transitions. Let D₁ = {d₀, d₁, d′₁} and d₂ ∉ D₁. Let κ₁ = (D₁, E₁, f₁), κ₂ = (D₁ ∪ {d₂}, E₂, f₂), κ₃ = (D₁ ∪ {d₂}, E₁, f₁), where the trees E₁, E₂ are described below and node annotations of the form (q) correspond to values of f₁, f₂, e.g. f₁(d₀) = q⁽⁰⁾.

E₁, f₁: the tree with root d₀ (labelled q⁽⁰⁾) and children d₁ (labelled q⁽¹⁾) and d′₁ (whose label is not affected by the transitions below). E₂, f₂: the same tree with d₀ relabelled r⁽⁰⁾, d₁ relabelled r⁽¹⁾, the label of d′₁ unchanged, and an additional leaf d₂ (labelled r⁽²⁾) below d₁.

For κ<sup>1</sup> to evolve into κ<sup>2</sup> (on (t, d2)), we need (q(0), q(1), t, r(0), r(1), r(2)) ∈ δ (2) <sup>Q</sup> . On the other hand, to go from κ<sup>2</sup> to κ<sup>3</sup> (on (t, d2)), we want (r(0), r(1), r(2), t, q(0), q(1)) ∈ δ (2) <sup>A</sup> .

**Definition 6.** A trace of a leafy automaton A is a sequence w = l₁ ··· l_h ∈ (Σ × D)* such that $\kappa_0 \xrightarrow{l_1} \kappa_1 \cdots \kappa_{h-1} \xrightarrow{l_h} \kappa_h$ where κ₀ = (∅, ∅, ∅). A configuration κ = (D, E, f) is accepting if E and f are empty. A trace w is accepted by A if there is a non-empty sequence of transitions as above with κ_h accepting. The set of traces (resp. accepted traces) of A is denoted by Tr(A) (resp. L(A)).

Remark 1. When writing states, we will often use superscripts (i) to indicate the intended level. So $(q^{(0)}, \cdots, q^{(i-1)}) \xrightarrow{t} (r^{(0)}, \cdots, r^{(i)})$ refers to $(q^{(0)}, \cdots, q^{(i-1)}, t, r^{(0)}, \cdots, r^{(i)}) \in \delta_{\mathsf{Q}}^{(i)}$; similarly for $\delta_{\mathsf{A}}^{(i)}$ transitions. For i = 0, this degenerates to $\dagger \xrightarrow{t} r^{(0)}$ and $r^{(0)} \xrightarrow{t} \dagger$.

Example 2. Consider the 1-LA over Σ_Q = {start, inc}, Σ_A = {dec, end}. Let Q⁽⁰⁾ = {0}, Q⁽¹⁾ = {0} and define δ by: $\dagger \xrightarrow{\mathsf{start}} 0$, $0 \xrightarrow{\mathsf{inc}} (0, 0)$, $(0, 0) \xrightarrow{\mathsf{dec}} 0$, $0 \xrightarrow{\mathsf{end}} \dagger$. The accepted traces of this 1-LA have the form $(\mathsf{start}, d_0)\,\big(\|_{i=0}^{n} (\mathsf{inc}, d_1^i)\,(\mathsf{dec}, d_1^i)\big)\,(\mathsf{end}, d_0)$, where ∥ denotes an arbitrary interleaving of the component words, i.e. they are valid histories of a single non-negative counter (histories such that the counter starts and ends at 0). In this case, all traces are simply prefixes of such words.
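A sketch of how this particular 1-LA processes a data word (Python; data values are arbitrary identifiers, the parent function is passed in explicitly, and state bookkeeping is trivial here because all states equal 0):

```python
def run_counter_la(word, pred):
    """word: list of (tag, data) pairs; pred: dict giving the parent of each level-1 value.
    Returns True iff word is an accepted trace of the 1-LA of Example 2."""
    seen, tree = set(), {}                      # D, and the current subtree E (data -> state)
    for tag, d in word:
        if tag == "start":                      # level-0 question: create the root
            if d in seen or tree: return False
            seen.add(d); tree[d] = 0
        elif tag == "inc":                      # level-1 question: add a fresh leaf under the root
            if d in seen or pred.get(d) not in tree: return False
            seen.add(d); tree[d] = 0
        elif tag == "dec":                      # level-1 answer: remove that leaf
            if d not in tree or pred.get(d) is None: return False
            del tree[d]
        elif tag == "end":                      # level-0 answer: remove the root, only if it is a leaf
            if d not in tree or any(pred.get(e) == d for e in tree): return False
            del tree[d]
        else:
            return False
    return len(word) > 0 and not tree           # accepted: the final configuration has empty E

# run_counter_la([("start","r"),("inc","a"),("inc","b"),("dec","a"),("dec","b"),("end","r")],
#                {"a": "r", "b": "r"})  -->  True
```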

Remark 2. Note that, whenever a leafy automaton reads (t, d) (t ∈ ΣQ) and the level of d is greater than 0, then it must have read a unique question (t , pred(d)) earlier. Also, observe that an LA trace contains at most two occurrences of the same data value, such that the first is paired with a question and the second is paired with an answer. Because the question and the answer share the same data value, we can think of the answer as answering the question, like in game semantics. Indeed, justification pointers from answers to questions will be represented in this way in Theorem 3. Finally, we note that LA traces are invariant under tree automorphisms of D.

**Lemma 1.** The emptiness problem for 2-LA is undecidable. For 1-LA, it is reducible to the reachability problem for VASS in polynomial time and there is a reverse reduction in exponential time, so it is decidable in Ackermannian time [32] but not elementary [13].

Proof. For 2-LA we reduce from the halting problem for two-counter machines. Two counters can be simulated using configurations with two level-1 nodes, one for each counter. The number of children at level 2 encodes the counter value. Zero tests can be implemented by removing the corresponding level-1 node and creating a new one. This is possible only when the node is a leaf, i.e., it does not have children at level 2. The state of the 2-counter machine can be maintained at level 0, the states at level 1 indicate the name of the counter, and the level-2 states are irrelevant.

The translation from 1-LA to VASS is straightforward and based on representing 1-LA configurations by the state at level 0 and, for each state at level 1, the count of its occurrences. The reverse translation is based on the same idea and extends the encoding of a non-negative counter in Example 2, where the exponential blow up is simply due to the fact that vector updates in VASS are given in binary whereas 1-LA transitions operate on single branches.

#### **Lemma 2.** 1-LA equivalence is undecidable.

Proof. We provide a direct reduction from the halting problem for 2-counter machines, where both counters are required to be zero initially as well as finally. The main obstacle is that implementing zero tests as in the proof of the first part of Lemma 1 is not available because we are restricted to leafy automata with levels 0 and 1 only. To overcome it, we exploit the power of the equivalence problem where one of the 1-LA will have the task not of correctly simulating zero tests but recognising zero tests that are incorrect. The complete argument can be found in the full paper [18, Appendix B].

### **5 Local leafy automata (LLA)**

Here we identify a restricted variant of LA for which the emptiness problem is decidable. We start with a technical definition.

**Definition 7.** A k-LA is bounded at level i (0 ≤ i ≤ k) if there is a bound b such that each node at level i can create at most b children during a run. We refer to b as the branching bound.

Note that we are defining a "global" bound on the number of children that a node at level i may create across a whole run, rather than a "local" bound on the number of children a node may have in a given configuration.

To motivate the design of LLA, we observe that the undecidability argument (for the emptiness problem) for 2-LA used two consecutive levels (0 and 1) that are not bounded. For the node at level 0, this corresponded to the number of zero tests, while an unbounded counter is simulated at level 1. In the following we will eliminate consecutive unbounded levels by introducing an alternating pattern of bounded and unbounded levels. Even-numbered layers (i = 0, 2, ...) will be bounded, while odd-numbered layers will be unbounded. Observe in particular that the root (layer 0) is bounded. As we will see later, this alternation reflects the term/context distinction in game semantics: the levels corresponding to terms are bounded, and the levels corresponding to contexts are unbounded.

With this restriction alone, it is possible to reconstruct the undecidability argument for 4-LA, as two unbounded levels may still communicate. Thus we introduce a restriction on how many levels a transition can read and modify.

**–** when adding or removing a leaf at an odd level 2i + 1, the automaton will be able to access levels 2i, 2i − 1 and 2i − 2; while

**–** when adding or removing a leaf at an even level 2i, the automaton will be able to access levels 2i − 1 and 2i − 2.

In particular, when an odd level produces a leaf, it will not be able to see the previous odd level. The above constraints mean that the transition functions $\delta_{\mathsf{Q}}^{(i)}$, $\delta_{\mathsf{A}}^{(i)}$ can be presented in a more concise form, given below.

$$\begin{aligned} \delta\_{\mathsf{Q}}^{(i)} &\subseteq \begin{cases} Q^{(i-2,i-1)} \times \Sigma\_{\mathsf{Q}} \times Q^{(i-2,i-1,i)} & \text{if } i \text{ is even} \\ Q^{(i-3,i-2,i-1)} \times \Sigma\_{\mathsf{Q}} \times Q^{(i-3,i-2,i-1,i)} & \text{if } i \text{ is odd} \end{cases} \\ \delta\_{\mathsf{A}}^{(i)} &\subseteq \begin{cases} Q^{(i-2,i-1,i)} \times \Sigma\_{\mathsf{A}} \times Q^{(i-2,i-1)} & \text{if } i \text{ is even} \\ Q^{(i-3,i-2,i-1,i)} \times \Sigma\_{\mathsf{A}} \times Q^{(i-3,i-2,i-1)} & \text{if } i \text{ is odd} \end{cases} \end{aligned}$$

In terms of the previous notation used for LA, (q(i−2), q(i−1), x, r(i−2), r(i−1), r(i)) ∈ δ (i) <sup>Q</sup> denotes all tuples of the form (q, q(i−2), q(i−1), x, q, r(i−2), r(i−1), r(i)), where q ranges over Q(0,··· ,i−3).

**Definition 8.** A level-k local leafy automaton (k-LLA) is a k-LA whose transition function admits the above-mentioned presentation and which is bounded at all even levels.

**Theorem 2.** The emptiness problem for LLA is decidable.

Proof (Sketch). Let b be a bound on the number of children created by each even node during a run.

The critical observation is that, once a node d at even level 2i has been created, all subsequent actions of descendants of d access (read and/or write) the states at levels 2i−1 and 2i−2 at most 2b times. The shape of the transition function dictates that this can happen only when child nodes at level 2i + 1 are added or removed. In addition, the locality property ensures that the automaton will never access levels < 2i − 2 at the same time as node d or its descendants.

We will make use of these facts to construct summaries for nodes on even levels which completely describe such a node's lifetime, from its creation as a leaf until its removal, and in between performing at most 2b reads-writes of the parent and grandparent states. A summary is a sequence of quadruples of states: two pairs of states at levels 2i − 2 and 2i − 1. The first pair are the states we expect to find on these levels, while the second are the states to which we update these levels. Hence a summary at level 2i is a complete record of a valid sequence of read-writes and state changes during the lifetime of a node on level 2i.

We proceed by induction and show how to calculate the complete set of summaries at level 2i given the complete set of summaries at level 2i + 2. We construct a program for deciding whether a given sequence is a summary at level 2i. This program can be evaluated via Vector Addition Systems with States (VASS). Since we can finitely enumerate all candidate summaries at level 2i, this gives us a way to compute summaries at level 2i. Proceeding this way, we finally calculate summaries at level 2. At this stage, we can reduce the emptiness problem for the given LLA to a reachability test on a VASS.

The complete argument is given in the full paper [18, Appendix C].

Let us remark also that the problem becomes undecidable if we remove the boundedness restriction or allow transitions to look one level further.

#### **6 From FICA to LA**

Recall from Section 3 that, to interpret base types, game semantics uses moves from the set

$$\begin{array}{l} \mathcal{M} = M\_{\mathsf{[com]}} \cup M\_{\mathsf{[exp]}} \cup M\_{\mathsf{[var]}} \cup M\_{\mathsf{[sem]}} \\ = \{\text{run, done, q, read, grb, rls, ok} \} \cup \{i, \text{write}(i) \,|\, 0 \le i \le \text{max}\}. \end{array}$$

The game semantic interpretation of a term-in-context Γ ⊢ M : θ is a strategy over the arena ⟦Γ ⊢ θ⟧, which is obtained through product and arrow constructions, starting from arenas corresponding to base types. As both constructions rely on the disjoint sum, the moves from ⟦Γ ⊢ θ⟧ are derived from the base types present in types inside Γ and θ. To indicate the exact occurrence of a base type from which each move originates, we will annotate elements of M with a specially crafted scheme of superscripts. Suppose Γ = {x₁ : θ₁, ··· , x_l : θ_l}. The superscripts will have one of the two forms, where i ∈ ℕ* and ρ ∈ ℕ:


The annotated moves will be written as m<sup>(i,ρ)</sup> or m<sup>(x<sub>v</sub>i,ρ)</sup>, where m ∈ M. We will sometimes omit ρ on the understanding that this represents ρ = 0. Similarly, when i is omitted, the intended value is ε. Thus, m stands for m<sup>(ε,0)</sup>.

The next definition explains how the i superscripts are linked to moves from θ. Given <sup>X</sup> ⊆ {m(i,ρ) <sup>|</sup>i <sup>∈</sup> <sup>N</sup><sup>∗</sup>, ρ <sup>∈</sup> <sup>N</sup>} and <sup>y</sup> <sup>∈</sup> <sup>N</sup> ∪ {x1, ··· , xl}, we let yX <sup>=</sup> {m(yi,ρ) <sup>|</sup> <sup>m</sup>(i,ρ) <sup>∈</sup> <sup>X</sup>}.

**Definition 9.** Given a type θ, the corresponding alphabet T<sup>θ</sup> is defined as follows

$$\begin{array}{c} \mathcal{T}\_{\beta} = \{ \,\, m^{(\epsilon,\rho)} \,|\,\, m \in M\_{\mathsf{[\beta]}}, \,\rho \in \mathbb{N} \} \qquad \beta = \mathbf{com}, \mathbf{exp}, \mathbf{var}, \mathbf{sem} \\\mathcal{T}\_{\theta\_h \to \ldots \to \theta\_1 \to \beta} = \bigcup\_{u=1}^h (u \mathcal{T}\_{\theta\_u}) \cup \mathcal{T}\_{\beta} \end{array}$$

For Γ = {x₁ : θ₁, ··· , x_l : θ_l}, the alphabet $\mathcal{T}_{\Gamma \vdash \theta}$ is defined to be $\mathcal{T}_{\Gamma \vdash \theta} = \bigcup_{v=1}^{l} (x_v \mathcal{T}_{\theta_v}) \cup \mathcal{T}_\theta$.

Example 3. The alphabet $\mathcal{T}_{f:\mathbf{com}\to\mathbf{com},\,x:\mathbf{com} \vdash \mathbf{com}}$ is $\{\mathsf{run}^{(f1,\rho)}, \mathsf{done}^{(f1,\rho)}, \mathsf{run}^{(f,\rho)}, \mathsf{done}^{(f,\rho)}, \mathsf{run}^{(x,\rho)}, \mathsf{done}^{(x,\rho)}, \mathsf{run}^{(\epsilon,\rho)}, \mathsf{done}^{(\epsilon,\rho)} \mid \rho \in \mathbb{N}\}$.

To represent the game semantics of terms-in-context, of the form Γ M : θ, we are going to use finite subsets of T<sup>Γ</sup> <sup>θ</sup> as alphabets in leafy automata. The subsets will be finite, because ρ will be bounded. Note that T<sup>θ</sup> admits a natural partitioning into questions and answers, depending on whether the underlying move is a question or answer.

We will represent plays using data words in which the underpinning sequence of tags will come from an alphabet as defined above. Superscripts and data are used to represent justification pointers. Intuitively, we represent occurrences of questions with data values. Pointers from answers to questions just refer to these values. Pointers from questions use bounded indexing with the help of ρ.

Initial question-moves do not have a pointer and to represent such questions we simply use ρ = 0. For non-initial questions, we rely on the tree structure of D and use ρ to indicate the ancestor of the currently read data value that we mean to point at. Consider a trace w(ti, di) ending in a non-initial question, where d<sup>i</sup> is a level-i data value and i > 0. In our case, we will have t<sup>i</sup> ∈ T<sup>Γ</sup> <sup>θ</sup>, i.e. t<sup>i</sup> = m(··· ,ρ). By Remark 2, trace w contains unique occurrences of questions (t0, d0), ··· ,(t<sup>i</sup>−1, d<sup>i</sup>−1) such that pred(d<sup>j</sup> ) = d<sup>j</sup>−<sup>1</sup> for j = 1, ··· , i. The pointer from (ti, di) goes to one of these questions, and we use ρ to represent the scenario in which the pointer goes to (t<sup>i</sup>−(1+ρ), d<sup>i</sup>−(1+ρ)).

Pointers from answer-moves to question-moves are represented simply by using the same data value in both moves (in this case we use ρ = 0).

We will also use ε-tags ε_Q (question) and ε_A (answer), which do not contribute moves to the represented play. Each ε_Q will always be answered with ε_A. Note that the use of ρ, ε_Q, ε_A means that several data words may represent the same play (see Examples 4, 6).

Example 4. Suppose d_0 = pred(d_1), d_1 = pred(d_2) = pred(d′_2), d_2 = pred(d_3), and d′_2 = pred(d′_3). Then the data word (run, d_0) (run^f, d_1) (run^f1, d_2) (run^f1, d′_2) (run^(x,2), d_3) (run^(x,2), d′_3) (done^x, d_3), which is short for (run^(ε,0), d_0) (run^(f,0), d_1) (run^(f1,0), d_2) (run^(f1,0), d′_2) (run^(x,2), d_3) (run^(x,2), d′_3) (done^(x,0), d_3), represents the play

$$\mathtt{run}\;\; \mathtt{run}^{f}\;\; \mathtt{run}^{f1}\;\; \mathtt{run}^{f1}\;\; \mathtt{run}^{x}\;\; \mathtt{run}^{x}\;\; \mathtt{done}^{x}$$

Example 5. Consider the LA A = ⟨Q, 3, Σ, δ⟩, where Q^(0) = {0, 1, 2}, Q^(1) = {0}, Q^(2) = {0, 1, 2}, Q^(3) = {0}, Σ_Q = {run, run^f, run^f1, run^(x,2)}, Σ_A = {done, done^f, done^f1, done^x}, and δ is given by

$$\begin{array}{cccc} \dagger \xrightarrow{\text{run}} 0 & 0 \xrightarrow{\text{run}^f} (1,0) & (1,0) \xrightarrow{\text{done}^f} 2 & 2 \xrightarrow{\text{done}} \dagger & (1,0) \xrightarrow{\text{run}^{f1}} (1,0,0) \\ (1,0,0) \xrightarrow{\text{run}^{(x,2)}} (1,0,1,0) & (1,0,1,0) \xrightarrow{\text{done}^{(x,0)}} (1,0,2) & (1,0,2) \xrightarrow{\text{done}^{f1}} (1,0) \end{array}$$

Then traces from Tr(A) represent all plays from σ = ⟦f : **com** → **com**, x : **com** ⊢ fx⟧, including the play from Example 4, and L(A) represents comp(σ).
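For readers who prefer to see the automaton of Example 5 spelled out, here is its transition table δ transcribed as data in Haskell. Each transition relates the sequence of states along the path from the root to the affected node before the step to the corresponding sequence afterwards; the tag names are ad-hoc ASCII renderings of the moves and are not the paper's notation.

```haskell
-- Example 5, delta as data: questions extend the path by a new leaf state,
-- answers shorten it; [] plays the role of the empty tree (the dagger).
type StateSeq = [Int]

deltaExample5 :: [(StateSeq, String, StateSeq)]
deltaExample5 =
  [ ([],           "run",     [0])           -- create the root
  , ([0],          "run_f",   [1, 0])        -- question run^f: add a level-1 leaf
  , ([1, 0],       "run_f1",  [1, 0, 0])     -- question run^{f1}: add a level-2 leaf
  , ([1, 0, 0],    "run_x2",  [1, 0, 1, 0])  -- question run^{(x,2)}: add a level-3 leaf
  , ([1, 0, 1, 0], "done_x0", [1, 0, 2])     -- answer done^{(x,0)}: remove the level-3 leaf
  , ([1, 0, 2],    "done_f1", [1, 0])        -- answer done^{f1}: remove the level-2 leaf
  , ([1, 0],       "done_f",  [2])           -- answer done^f: remove the level-1 leaf
  , ([2],          "done",    [])            -- answer done: remove the root
  ]
```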

Example 6. One might wish to represent plays of σ from the previous example using data values d_0, d_1, d′_1, d″_1, d_2, d′_2 such that d_0 = pred(d_1) = pred(d′_1) = pred(d″_1) and d_1 = pred(d_2) = pred(d′_2), so that the play from Example 4 is represented by (run^(ε,0), d_0) (run^(f,0), d_1) (run^(f1,0), d_2) (run^(f1,0), d′_2) (run^(x,0), d′_1) (run^(x,0), d″_1) (done^(x,0), d′_1). Unfortunately, it is impossible to construct a 2-LA that would accept all representations of such plays. To achieve this, the automaton would have to make sure that the number of run^f1s is the same as that of run^xs. Because the former are labelled with level-2 values and the latter with incomparable level-1 values, the only point of communication (that could be used for comparison) is the root. However, the root cannot accommodate unbounded information, while plays of σ can feature an unbounded number of run^f1s, which could well be consecutive.

Before we state the main result linking FICA with leafy automata, we note some structural properties of the automata. Questions will create a leaf, and answers will remove a leaf. P-moves add leaves at odd levels (questions) and remove leaves at even levels (answers), while O-moves have the opposite effect at each level. Finally, when removing nodes at even levels we will not need to check if a node is a leaf. We call the last property even-readiness.

Even-readiness is a consequence of the WAIT condition in the game semantics. The condition captures well-nestedness of concurrent interactions – a term can terminate only after subterms terminate. In the leafy automata setting, this is captured by the requirement that only leaf nodes can be removed, i.e. a node can be removed only if all of its children have been removed beforehand. It turns out that, for P-answers only, this property will come for free. Formally, whenever the automaton arrives at a configuration κ = (D, E, f), where d ∈ E and there is a transition

$$(f(pred^{(2i)}(d)), \dots, f(pred(d)), f(d), t, f'(pred^{(2i)}(d)), \dots, f'(pred(d))) \in \delta\_{\mathbf{A}}^{(2i)},$$

then d is a leaf. In contrast, our automata will not satisfy the same property for O-answers (the environment) and for such transitions it is crucial that the automaton actually checks that only leaves can be removed.

**Theorem 3.** For any FICA-term Γ ⊢ M : θ, there exists an even-ready leafy automaton A_M over a finite subset of T_{Γ⊢θ} + {ε_Q, ε_A} such that the set of plays represented by data words from Tr(A_M) is exactly ⟦Γ ⊢ M : θ⟧. Moreover, L(A_M) represents comp(⟦Γ ⊢ M : θ⟧) in the same sense.

Proof (Sketch). Because every FICA-term can be converted to βη-normal form, we use induction on the structure of such normal forms. The base cases are: Γ ⊢ **skip** : **com** ($Q^{(0)} = \{0\}$; $\dagger \xrightarrow{\mathsf{run}} 0$, $0 \xrightarrow{\mathsf{done}} \dagger$), Γ ⊢ **div** : **com** ($Q^{(0)} = \{0\}$; $\dagger \xrightarrow{\mathsf{run}} 0$), and Γ ⊢ i : **exp** ($Q^{(0)} = \{0\}$; $\dagger \xrightarrow{\mathsf{q}} 0$, $0 \xrightarrow{i} \dagger$).

The remaining cases are inductive. When referring to the inductive hypothesis for a subterm $M_i$, we shall use subscripts $i$ to refer to the automata components, e.g. $Q_i^{(j)}$, $\xrightarrow{m}_i$ etc. In contrast, $Q^{(j)}$, $\xrightarrow{m}$ will refer to the automaton that is being constructed. Inference lines will indicate that the transitions listed under the line should be added to the new automaton provided the transitions listed above the line are present in the automaton obtained via the induction hypothesis. We discuss a selection of technical cases below.

Γ ⊢ M_1 || M_2. In this case we need to run the automata for M_1 and M_2 concurrently. To this end, their level-0 states will be combined ($Q^{(0)} = Q_1^{(0)} \times Q_2^{(0)}$), but not deeper states ($Q^{(j)} = Q_1^{(j)} + Q_2^{(j)}$, $1 \le j \le k$). The first group of transitions activate and terminate the two components respectively:

$$\frac{\dagger \xrightarrow{\mathsf{run}}_1 q_1^{(0)} \qquad \dagger \xrightarrow{\mathsf{run}}_2 q_2^{(0)}}{\dagger \xrightarrow{\mathsf{run}} (q_1^{(0)}, q_2^{(0)})} \qquad\qquad \frac{q_1^{(0)} \xrightarrow{\mathsf{done}}_1 \dagger \qquad q_2^{(0)} \xrightarrow{\mathsf{done}}_2 \dagger}{(q_1^{(0)}, q_2^{(0)}) \xrightarrow{\mathsf{done}} \dagger}$$

The remaining transitions advance each component:

$$\frac{(q_1^{(0)}, \cdots, q_1^{(j)}) \xrightarrow{m}_1 (r_1^{(0)}, \cdots, r_1^{(j')}) \qquad q_2^{(0)} \in Q_2^{(0)}}{((q_1^{(0)}, q_2^{(0)}), \cdots, q_1^{(j)}) \xrightarrow{m} ((r_1^{(0)}, q_2^{(0)}), \cdots, r_1^{(j')})} \qquad \frac{(q_2^{(0)}, \cdots, q_2^{(j)}) \xrightarrow{m}_2 (r_2^{(0)}, \cdots, r_2^{(j')}) \qquad q_1^{(0)} \in Q_1^{(0)}}{((q_1^{(0)}, q_2^{(0)}), \cdots, q_2^{(j)}) \xrightarrow{m} ((q_1^{(0)}, r_2^{(0)}), \cdots, r_2^{(j')})}$$

where m ≠ run, done.

Γ ⊢ **newvar** x := i **in** M_1. By [22], the semantics of this term is obtained from the semantics of Γ, x ⊢ M_1 by

1. restricting to plays in which each read^x- or write(n)^x-move is immediately followed by its answer,
2. further restricting to plays in which the answers to read^x- and write(n)^x-moves are consistent with the current value of x (initially i), and
3. erasing all moves associated with x.

To implement 1., we will lock the automaton after each read^x- or write(n)^x-move, so that only an answer to that move can be played next. Technically, this will be done by adding an extra bit (lock) to the level-0 state. To deal with 2., we keep track of the current value of x, also at level 0. This makes it possible to ensure that answers to read^x are consistent with the stored value and that write(n)^x transitions cause the right change. Erasing from condition 3 is implemented by replacing all moves with the x subscript with ε_Q- and ε_A-tags.

Accordingly, we have $Q^{(0)} = (Q_1^{(0)} + (Q_1^{(0)} \times \{\mathsf{lock}\})) \times \{0, \cdots, max\}$ and $Q^{(j)} = Q_1^{(j)}$ ($1 \le j \le k$). As an example of a transition, we give the transition related to writing:

$$\frac{(q_1^{(0)}, \cdots, q_1^{(j)}) \xrightarrow{\mathsf{write}(z)^{(x,\rho)}}_1 (r_1^{(0)}, \cdots, r_1^{(j')}) \qquad 0 \le n, z \le max}{((q_1^{(0)}, n), \cdots, q_1^{(j)}) \xrightarrow{\epsilon_Q} ((r_1^{(0)}, \mathsf{lock}, z), \cdots, r_1^{(j')})}$$

Γ ⊢ f M_h ··· M_1 : **com** with f : θ_h → ··· → θ_1 → **com**. Here we will need $Q^{(0)} = \{0, 1, 2\}$, $Q^{(1)} = \{0\}$, $Q^{(j+2)} = \sum_{u=1}^{h} Q_u^{(j)}$ ($0 \le j \le k$). The first group of transitions corresponds to calling and returning from f: $\dagger \xrightarrow{\mathsf{run}} 0$, $0 \xrightarrow{\mathsf{run}^f} (1, 0)$, $(1, 0) \xrightarrow{\mathsf{done}^f} 2$, $2 \xrightarrow{\mathsf{done}} \dagger$. Additionally, in state (1, 0) we want to enable the environment to spawn an unbounded number of copies of each of Γ ⊢ M_u : θ_u (1 ≤ u ≤ h). This is done through rules that embed the actions of the automata for M_u while (possibly) relabelling the moves in line with our convention for representing moves from game semantics. Such transitions have the general form

$$\frac{(q_u^{(0)}, \cdots, q_u^{(j)}) \xrightarrow{m^{(t,\rho)}}_u (q_u^{(0)}, \cdots, q_u^{(j')})}{(1, 0, q_u^{(0)}, \cdots, q_u^{(j)}) \xrightarrow{m^{(t',\rho')}} (1, 0, q_u^{(0)}, \cdots, q_u^{(j')})}$$

Note that this case also covers f : **com** (h = 0).

More details and the remaining cases are covered in the full paper [18, Appendix D], along with an example of a term and the corresponding LA.

#### **7 Local FICA**

In this section we identify a family of FICA terms that can be translated into LLA rather than LA. To achieve boundedness at even levels, we remove **while**<sup>5</sup>. To achieve restricted communication, we will constrain the distance between a variable declaration and its use. Note that in the translation, the application of function-type variables increases LA depth. So in LFICA we will allow the link between the binder **newvar**/**newsem** x and each use of x to "cross" at most one occurrence of a free variable. For example, the following terms

**– newvar** x := 0 **in** x := 1 || f(x := 2),

**– newvar** x := 0 **in** f(**newvar** y **in** f(y := 1) || x := !y)

will be allowed, but not **newvar** x := 0 **in** f(f(x := 1)).

To define the fragment formally, given a term Q in βη-normal form, we use a notion of the applicative depth of a variable x : β (β = **var**, **sem**) inside Q, written adx(Q) and defined inductively by the table below. The applicative depth is increased whenever a functional identifier is applied to a term containing x.


Note that in our examples above, in the first two cases the applicative depth of x is 2; and in the third case it is 3.
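Since only the informal description of ad_x is given here, the following Haskell fragment is merely a sketch of the intended recursion on a small fragment of the syntax; the clauses are an assumption reconstructed from that description and checked against the three example terms above, and the constructor names are ours.

```haskell
-- Sketch of applicative depth ad_x on a toy fragment of FICA syntax.
data Tm = Var String          -- identifier occurrence
        | Lit Int             -- integer literal
        | Assign String Tm    -- x := M
        | Deref String        -- !x
        | Par Tm Tm           -- M || N
        | App String [Tm]     -- application f M1 ... Mh of a free functional identifier
        | NewVar String Tm    -- newvar x := i in M (initial value elided)

adepth :: String -> Tm -> Int
adepth x t = case t of
  Var y      -> if y == x then 1 else 0
  Lit _      -> 0
  Deref y    -> if y == x then 1 else 0
  Assign y m -> max (if y == x then 1 else 0) (adepth x m)
  Par m n    -> max (adepth x m) (adepth x n)
  NewVar _ m -> adepth x m
  App _ ms   -> let d = maximum (0 : map (adepth x) ms)
                in if d > 0 then d + 1 else 0   -- applying f to a term containing x adds 1

-- On the three examples above, adepth "x" returns 2, 2 and 3 respectively, so the
-- first two terms meet the ad_x(N) <= 2 bound of Definition 10 and the third does not.
```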

**Definition 10 (Local** FICA**).** A FICA-term Γ ⊢ M : θ is local if its βη-normal form does not contain any occurrences of **while** and, for every subterm of the normal form of the shape **newvar**/**newsem** x := i **in** N, we have ad_x(N) ≤ 2. We write LFICA for the set of local FICA terms.

**Theorem 4.** For any LFICA-term Γ ⊢ M : θ, the automaton A_M obtained from the translation in Theorem 3 can be presented as a LLA.

Proof (Sketch). We argue by induction that the constructions from Theorem 3 preserve presentability as a LLA.

The case of parallel composition involves running copies of M<sup>1</sup> and M<sup>2</sup> in parallel without communication, with their root states stored as a pair at level 0. Note, though, that each of the automata transitions independently of the state of the other automaton. In consequence, if the automata M<sup>1</sup> and M<sup>2</sup> are LLA, so

<sup>5</sup> The automaton for **while**M **do** N may repeatedly visit the automata for M and N, generating an unbounded number of children at level 0 in the process.

will be the automaton for M1||M2. The branching bound after the construction is the sum of the two bounds for M<sup>1</sup> and M2.

For Γ ⊢ **newvar** x := i **in** M, because the term is in LFICA, so is Γ, x : **var** ⊢ M and we have ad_x(M) ≤ 2. Then we observe that in the translation of Theorem 3 (for Γ, x : **var** ⊢ M : θ) the questions related to x (namely write(i)^(x,ρ) and read^(x,ρ)) correspond to creating leaves at levels 1 or 3, while the corresponding answers (ok^(x,ρ) and i^(x,ρ) respectively) correspond to removing such leaves. In the construction for Γ ⊢ **newvar** x := i **in** M, such transitions need access to the root (to read/update the current state) and the root is indeed within the allowable range: in a LLA, transitions creating/destroying leaves at level 3 can read/write at level 0. All other transitions (not labelled with x) proceed as in M and need not consult the root for additional information about the current state, as it is propagated. Consequently, if M is represented by a LLA then the interpretation of **newvar** x := i **in** M is also a LLA. The construction does not affect the branching bound, because the resultant runs can be viewed as a subset of runs of the automaton for M, i.e. those in which reads and writes are related.

For fM<sup>h</sup> ··· M1, we observe that the construction first creates two nodes at levels 0 and 1, and the node at level 1 is used to run an unbounded number of copies of (the automaton for) Mi. The copies do not need access to the states stored at levels 0 and 1, because they are never modified when the copies are running. Consequently, if each M<sup>i</sup> can be translated into a LLA, the outcome of the construction in Theorem 3 is also a LLA. The new branching bound is the maximum over bounds from M1, ··· , Mh, because at even levels children are produced as in M<sup>i</sup> and level 0 produces only 1 child.

**Corollary 1.** For any LFICA-term Γ ⊢ M : θ, the problem of determining whether comp(⟦Γ ⊢ M⟧) is empty is decidable.

Theorems 1 and 2 imply the above. Thanks to Theorem 1, it is decidable if a LFICA term is equivalent to a term that always diverges (cf. example on page 187). In case of inequivalence, our results could also be applied to extract the distinguishing context, first by extracting the witnessing trace from the argument underpinning Theorem 2 and then feeding it to the Definability Theorem (Theorem 41 [22]). This is a valuable property given that in the concurrent setting bugs are difficult to replicate.

### **8 From LA to FICA**

In this section, we show how to represent leafy automata in FICA. Let A = ⟨Σ, k, Q, δ⟩ be a leafy automaton. We shall assume that Σ, Q ⊆ {0, ··· , max} so that we can encode the alphabet and states using type **exp**. We will represent a trace w generated by A by a play play(w), which simulates each transition with two moves, by O and P respectively. The child-parent links in D will be represented by justification pointers. We refer the reader to [18, Appendix F] for details. Below we just state the lemma that identifies the types that correspond to our encoding, where we write θ^{max+1} → β for θ → ··· → θ → β with max + 1 occurrences of θ.

**Lemma 3.** Let A be a k-LA and w ∈ Tr(A). Then play(w) is a play in θ_k, where θ_0 = **com**^{max+1} → **exp** and θ_{i+1} = (θ_i → **com**)^{max+1} → **exp** (i ≥ 0).
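To make the recursion in Lemma 3 concrete, the small sketch below unfolds θ_k as a string, with max fixed to 1 for readability; the helper names are ours.

```haskell
-- theta_0 = com^{max+1} -> exp,  theta_{i+1} = (theta_i -> com)^{max+1} -> exp
thetaStr :: Int -> Int -> String
thetaStr maxV 0 = arrows (replicate (maxV + 1) "com") "exp"
thetaStr maxV i = arrows (replicate (maxV + 1) arg) "exp"
  where arg = "(" ++ thetaStr maxV (i - 1) ++ " -> com)"

arrows :: [String] -> String -> String
arrows args res = concatMap (++ " -> ") args ++ res

-- thetaStr 1 0 == "com -> com -> exp"
-- thetaStr 1 1 == "(com -> com -> exp -> com) -> (com -> com -> exp -> com) -> exp"
```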

Before we state the main result, we recall from [22] that strategies corresponding to FICA terms satisfy a closure condition known as saturation: swapping two adjacent moves in a play belonging to such a strategy yields another play from the same strategy, as long as the swap yields a play and it is not the case that the first move is by O and the second one by P. Thus, saturated strategies express causal dependencies of P-moves on O-moves. Consequently, one cannot expect to find a FICA-term such that the corresponding strategy is the smallest strategy containing { play(w)| w ∈ Tr (A) }. Instead, the best one can aim for is the following result.

**Theorem 5.** Given a k-LA A, there exists a FICA term ⊢ M_A : θ_k such that ⟦⊢ M_A : θ_k⟧ is the smallest saturated strategy containing { play(w) | w ∈ Tr(A) }.

Proof (Sketch). Our assumption Q ⊆ {0, ··· , max} allows us to maintain A-states in the memory of FICA-terms. To achieve k-fold nesting, we use the higher-order structure of the term: λf^{(0)}. f^{(0)}(λf^{(1)}. f^{(1)}(λf^{(2)}. f^{(2)}(··· λf^{(k)}. f^{(k)}))). In fact, instead of the single variables f^{(i)}, we shall use sequences f_0^{(i)} ··· f_{max}^{(i)}, so that a question t_Q^{(i)} read by A at level i can be simulated by using the variable $f^{(i)}_{t_Q^{(i)}}$ (using our assumption Σ ⊆ {0, ··· , max}). Additionally, the term contains state-manipulating code that enables moves only if they are consistent with the transition function of A.

#### **9 Conclusion and further work**

We have introduced leafy automata, LA, and shown that they correspond to the game semantics of Finitary Idealized Concurrent Algol (FICA). The automata formulation makes the combinatorial challenges posed by the equivalence problem explicit. This is exemplified by a very transparent undecidability proof of the emptiness problem for LA. Our hope is that LA will make it possible to discover interesting fragments of FICA for which some variant of the equivalence problem is decidable. We have identified one such instance, namely local leafy automata (LLA), and a fragment of FICA that can be translated to them. The decidability of the emptiness problem for LLA implies decidability of a simple instance of the equivalence problem. This in turn allows us to decide some verification questions as in the example on page 187. Since these types of questions involve quantification over all contexts, the use of a fully-abstract semantics appears essential to solve them.

The obvious line of future work is to find some other subclasses of LA with decidable emptiness problem. Another interesting target is to find an automaton model for the call-by-value setting, where answers enable questions [2,25]. It would also be worth comparing our results with abstract machines [19], the Geometry of Interaction [31], and the π-calculus [6].

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Factorization in Call-by-Name and Call-by-Value Calculi via Linear Logic**

Claudia Faggian<sup>1</sup> and Giulio Guerrieri<sup>2</sup>

<sup>1</sup> Université de Paris, IRIF, CNRS, F-75013 Paris, France <sup>2</sup> University of Bath, Department of Computer Science, Bath, UK g.guerrieri@bath.ac.uk

**Abstract.** In each variant of the λ-calculus, factorization and normalization are two key properties that show how results are computed. Instead of proving factorization/normalization for the call-by-name (CbN) and call-by-value (CbV) variants separately, we prove them only once, for the bang calculus (an extension of the λ-calculus inspired by linear logic and subsuming CbN and CbV), and then we transfer the result via translations, obtaining factorization/normalization for CbN and CbV. The approach is robust: it still holds when extending the calculi with operators and extra rules to model some additional computational features.

### **1 Introduction**

The λ-calculus is the model of computation underlying functional programming languages and proof assistants. Actually there are many λ-calculi, depending on the *evaluation mechanism* (for instance, call-by-name and call-by-value—CbN and CbV for short) and *computational features* that the calculus aims to model.

In λ-calculi, a rewriting relation formalizes computational steps in program execution, and normal forms are the results of computations. In each calculus, a key question is to define a *normalizing strategy*: How to compute a result? Is there a reduction strategy which is guaranteed to output a result, if any exists?

Proving that a calculus admits a normalizing strategy is complex, and many techniques have been developed. A well-known method first proves *factorization* [4,32,19,2]. Given a calculus with a rewriting relation →, a strategy →l ⊆ → *factorizes* if →* ⊆ →l* · →¬l* (→¬l is the dual of →l), *i.e.* any reduction sequence can be rearranged so as to perform →l-steps first and then the other steps. If, moreover, the strategy satisfies some "good properties", we can conclude that the strategy is normalizing. Factorization is important also because it is commonly used as a building block in the proof of other properties of the *how-to-compute* kind. For instance, *standardization*, which generalizes factorization: every reduction sequence can be rearranged according to a predefined order between redexes.

*Two for One.* CbN and CbV λ-calculi are two distinct rewriting systems. Quoting from Levy [20]: *the existence of two separate paradigms* (CbN and CbV) *is troubling because to prove a certain property, such as factorization or normalization, for both systems we always need to do it twice*.


The *first aim* of our paper is to develop a technique for deriving factorization for both the CbN [4] and CbV [27] λ-calculi as corollaries of a *single* factorization theorem, and similarly for normalization. A key tool in our study is the *bang calculus* [11,15], a calculus inspired by linear logic in which CbN and CbV embed.

*The Bang Calculus.* The bang calculus is a variant of the λ-calculus where an operator ! plays the role of a marker for non-linear management: duplicability and discardability of resources. The bang calculus is nothing but Simpson's linear λ-calculus [31] without linear abstraction, or the untyped version of the implicative fragment of Levy's Call-by-Push-Value [20], as first observed by Ehrhard [10].

The motivation to study the bang calculus is to have a general framework where both CbN and CbV λ-calculi can be simulated, via two distinct *translations* inspired by Girard's embeddings [14] of the intuitionistic arrow into linear logic. So, a certain property can be studied in the bang calculus and then automatically transferred to the CbN and CbV settings by translating back.

This approach has so far mainly been exploited semantically [21,10,11,15,9,7], but it can also be used to study operational properties [15,30,13]. In this paper, we push forward this operational direction.

*The Least-Level Strategy.* We study a strategy from the literature of linear logic [8], namely *least-level reduction* →<sup>l</sup> , which fires a redex at minimal level—the *level* of a redex is the number of ! under which the redex occurs.

We prove that the least-level reduction factorizes and normalizes in the bang calculus, and then we transfer the same results to CbN and CbV λ-calculi (for suitable definitions of least-level in CbN and CbV), by exploiting properties of their translations into the bang calculus. A single proof suffices. It is two-for-one! Or even better, three-for-one.

The rewriting study of the least level strategy in the bang calculus is based on simple techniques for factorization and normalization we developed recently with Accattoli [2], which simplify and generalize Takahashi's method [32].

*Subtleties of the Embeddings.* Transferring factorization and normalization results via translation is highly non-trivial, *e.g.* in CPS translations [27]. This applies also to transferring least-level factorization from the bang calculus to the CbN and CbV λ-calculi. To transfer the property smoothly, the translations should preserve levels and normal forms, which is delicate, in particular for CbV. For instance, the embedding of CbV into the bang calculus defined in [15,30] does not preserve levels and normal forms. As a consequence, the CbV translation studied in [15,30] cannot be used to derive least-level factorization or *any* normalization result in a CbV setting from the corresponding result in the bang calculus.

Here we adopt the refined CbV embedding of Bucciarelli *et al.* [7], which does preserve levels and normal forms. While the preservation of normal forms is already stressed in [7], the preservation of levels is proved here for the first time, and it is based on non-trivial properties of the embedding.

*Beyond pure.* Our *second aim* is to show that the developed technique for the joined factorization and normalization of CbN and CbV via the bang calculus is *robust*. We do so by studying extensions of all three calculi with operators (or, in general, with extra rules) which model some additional computational features, such as non-deterministic or probabilistic choice. We then show that the technique scales up smoothly, under mild assumptions on the extension.

*A Motivating Example.* Let us illustrate our approach on a simple case, which we will use as a running example. De' Liguoro and Piperno's CbN non-deterministic λ-calculus Λ^cbn_⊕ [23] extends the CbN λ-calculus with an operator ⊕ whose reduction →⊕ models *non-deterministic choice*: t ⊕ s rewrites to either t or s. It admits a standardization result, from which it follows that the leftmost-outermost reduction strategy (noted →loβ⊕) is *complete*: if t has a normal form u then t →loβ⊕* u. In [22], de' Liguoro considers also a CbV variant Λ^cbv_⊕, extending Plotkin's CbV λ-calculus [27] with an operator ⊕. One may prove standardization and completeness once again from scratch, even though the proofs are similar.

The approach we propose here is to work in the bang calculus enriched with the operator ⊕. We show that the calculus satisfies *least-level factorization*, from which it follows that the least-level strategy (noted →ℓℓβ!⊕) is *complete*, *i.e.* if t has a normal form u, then t →ℓℓβ!⊕* u. The translation then guarantees that analogous results hold also in Λ^cbn_⊕ and Λ^cbv_⊕, without proving them again.

*The Importance of Being Modular.* The bang calculus with operators is actually a general formalism for several calculi, one calculus for each kind of computational feature modeled by operators. Concretely, the reduction → consists of −→<sup>β</sup>! (which subsumes CbN −→<sup>β</sup> and CbV −→<sup>β</sup><sup>v</sup> ) and other reduction rules −→<sup>ρ</sup>.

We decompose the proof of factorization of → in modules, by using the *modular approach* we recently introduced together with Accattoli [3].

The key module is the least-level factorization of →<sup>β</sup>! , because it is where the higher-order comes into play—this is done, once and for all. Then, we consider a generic reduction rule −→<sup>ρ</sup> to add to −→<sup>β</sup>! . Our general result is that if −→<sup>ρ</sup> has "good properties" and interacts well with −→<sup>β</sup>! (which amounts to an easy test, combinatorial in nature), then we have least-level factorization for −→<sup>β</sup>! ∪ −→<sup>ρ</sup>.

Putting all together, when →ρ is instantiated to a concrete reduction (such as →⊕), the user of our method only has to verify a simple test (namely Proposition 34) to conclude that →β! ∪ →ρ has least-level factorization. In particular, factorization for →β! is a ready-to-use black box the user need not worry about: our proof is robust enough to hold whatever the other rules are. Finally, the embeddings automatically give least-level factorization for the corresponding CbV and CbN calculi. Section 7 illustrates our method in the case →ρ = →⊕.

*Subtleties of the Modular Extensions.* To adopt the modular approach for factorization presented in [3], we have to face an important difficulty that arises when dealing with normalizing strategies, and which is not studied in [3].

A *normalizing* strategy cannot overlook redexes and it usually selects the redex r to fire through a property that r minimizes with respect to the redexes in the whole term, such as being a *least level* redex or being the *leftmost-outermost* (shortened to LO) redex—normalizing strategies are *positional*. The problem is that, in general, if →= →<sup>β</sup> ∪ →<sup>ρ</sup>, then →l<sup>o</sup> reduction is not the union of →lo<sup>β</sup> and →lo<sup>ρ</sup>: the normalizing strategy of the compound system is not obtained putting together the normalizing strategies of the components. Let us explain the issue on our running example →<sup>β</sup>⊕, in the familiar case of leftmost-outermost reduction.

*Example 1.* Consider head reductions for →β and for →β⊕ = →β ∪ →⊕, noted →hβ and →hβ⊕, respectively. In the term s = (II)(x ⊕ y), where I = λx.x, the subterm II (a β-redex) is in head position for both the reduction →β and its extension →β⊕. So, s →hβ I(x ⊕ y) and s →hβ⊕ I(x ⊕ y). And in the term t = (x ⊕ y)(II), the head position is occupied by x ⊕ y, which is a ⊕-redex. Therefore, II is not the head redex in t, neither for β nor for β⊕. In general, →hβ⊕ = →hβ ∪ →h⊕.

In contrast, for leftmost-outermost reduction →loβ⊕, which reduces the lo-redex, we have →loβ⊕ ≠ →loβ ∪ →lo⊕. Consider again the term t = (x ⊕ y)(II). Since x ⊕ y is not a β-redex, II is the lo-redex for →β. Instead, II is not the lo-redex for →β⊕ (here the lo-redex is x ⊕ y). So, t →loβ (x ⊕ y)I but t ↛loβ⊕ (x ⊕ y)I.

The least-level factorization for →<sup>β</sup>! , →<sup>β</sup>, and →<sup>β</sup><sup>v</sup> we prove here is robust enough to make it ready to be used as a module in a larger proof, where it may combine with operators and other rules. The key point is to define the least-level reduction from the very beginning as a reduction firing a redex at minimal level with respect to a general set of redexes (including β!, β or βv, respectively), so that it is "ready" to be extended with other reduction rules (see Section 4).

*Proofs.* All proofs are available in [12], the long version of this paper.

### **2 Background in Abstract Rewriting**

An (*abstract*) *rewriting system* [33, Ch. 1] is a pair (A, →) consisting of a set A and a binary relation → ⊆ A × A (called *reduction*) whose pairs are written t → s and called *steps*. A →*-sequence* from t is a sequence of →-steps. As usual, →* (resp. →=) denotes the transitive-reflexive (resp. reflexive) closure of →. We say that u is →-*normal* (or a →-normal form) if there is no t such that u → t.

In general, a term may or may not reduce to a normal form, and if it does, not all reduction sequences necessarily lead to one. A term is *weakly* or *strongly* normalizing, depending on whether it may or must reduce to normal form. More precisely, a term t is *strongly* →*-normalizing* if *every* maximal →-sequence from t ends in a →-normal form: any choice of →-steps will eventually lead to a normal form. A term t is *weakly* →*-normalizing* if t →* u for some →-normal u. If t is weakly but not strongly normalizing, how do we compute a normal form? This is the problem tackled by *normalization*: by repeatedly performing *only specific steps*, a normal form is eventually reached, provided that t can →-reduce to one.

**Definition 2 (Normalizing and complete strategy).** *A reduction* →<sup>e</sup> ⊆ → *is a* strategy for → *if it has the same normal forms as* →*. A strategy* →<sup>e</sup> *for* → *is:*


Note that if the strategy →<sup>e</sup> is complete and *deterministic* (*i.e.* for every t ∈ A, t →<sup>e</sup> s for at most one s ∈ A), then →<sup>e</sup> is a normalizing strategy for →.

Informally, a *strategy* for −→ is a way to control the fact that in a term there are different possible choices of a −→-step. A *normalizing strategy* for → is a strategy that is guaranteed to reach a →-normal form, if it exists, from any term. This provides a useful tool to show that a term is not weakly →-normalizing.

**Proving Normalization.** Factorization means that any →-sequence from a term to another can be rearranged by performing a certain kind of steps first. It provides a simple technique to establish that a strategy is normalizing.

**Definition 3 (Factorization).** *Let* (A, →) *be a rewriting system with* → = →e ∪ →i. *The relation* → *satisfies* e-factorization*, written* Fact(→e, →i)*, if*

> Fact(→e, →i):   (→e ∪ →i)* ⊆ →e* · →i*   (**Factorization**)

**Lemma 4 (Normalization [2]).** *Let* <sup>→</sup><sup>=</sup> <sup>→</sup><sup>e</sup> ∪ →¬<sup>e</sup> *, and* <sup>→</sup><sup>e</sup> *be a strategy for* <sup>→</sup>*. The strategy* →<sup>e</sup> *is* complete *for* → *if the following conditions hold:*


*The strategy* →<sup>e</sup> *is* normalizing *for* → *if it is complete and the following holds:*

*3. (* uniformity*) every weakly* →<sup>e</sup> *-normalizing term is strongly* →<sup>e</sup> *-normalizing.*

A sufficient condition for uniformity (and confluence) is the quasi-diamond.

**Property 5 (Newman [25])** *If a reduction* → *is* quasi-diamond *(i.e.* s←t→r *implies* s = r *or* s → u ← r *for some* u*), then* → *is uniform and confluent (i.e.* s <sup>∗</sup>← r →<sup>∗</sup> t *implies* s →<sup>∗</sup> u <sup>∗</sup>← t *for some* u*).*

**Proving Factorization.** Hindley [17] first noted that a local property implies factorization. Let → = →e ∪ →i. We say that →i *strongly postpones* after →e if

SP(→e, →i):   →i · →e ⊆ →e* · →i=   (**Strong Postponement**)

**Lemma 6 (Hindley [17]).** SP(→<sup>e</sup> , <sup>→</sup><sup>i</sup> ) *implies* Fact(→<sup>e</sup> , <sup>→</sup><sup>i</sup> )*.*
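On finite rewriting systems, the Strong Postponement condition can be checked by brute force, which may help to get a feel for Lemma 6. The sketch below represents relations as lists of pairs and tests SP(→e, →i) by enumeration; it is only an illustration on finite carriers, not part of the paper's development, and all names are ours.

```haskell
import Data.List (nub)

type Rel a = [(a, a)]

compose :: Eq a => Rel a -> Rel a -> Rel a
compose r s = nub [ (x, z) | (x, y) <- r, (y', z) <- s, y == y' ]

reflClosure :: Eq a => [a] -> Rel a -> Rel a
reflClosure carrier r = nub (r ++ [ (x, x) | x <- carrier ])

starClosure :: Eq a => [a] -> Rel a -> Rel a        -- reflexive-transitive closure
starClosure carrier r = go (reflClosure carrier r)
  where go acc = let acc' = nub (acc ++ compose acc r)
                 in if length acc' == length acc then acc else go acc'

-- SP(e, i):  i . e  ⊆  e* . i=
strongPostponement :: Eq a => [a] -> Rel a -> Rel a -> Bool
strongPostponement carrier e i =
  all (`elem` rhs) (compose i e)
  where rhs = compose (starClosure carrier e) (reflClosure carrier i)

-- Tiny check: on carrier [0,1,2] with e = [(0,1)] and i = [(1,2),(2,0)], the only
-- i.e composition is (2,1), which e*.i= cannot reproduce, so SP fails here.
```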

Strong postponement can rarely be used *directly*, because several interesting reductions—including β-reduction—do not satisfy it. However, it is at the heart of Takahashi's method [32] to prove head factorization of −→<sup>β</sup>, via the following immediate property that can also be used to prove other factorizations (see [2]).

**Property 7 (Characterization of factorization)** *We have* Fact(→<sup>e</sup> , <sup>→</sup><sup>i</sup> ) *if and only if there is a reduction* ◦→<sup>i</sup> *such that* ◦→<sup>i</sup> <sup>∗</sup> <sup>=</sup> <sup>→</sup><sup>i</sup> <sup>∗</sup> *and* SP(→<sup>e</sup> , ◦→<sup>i</sup> )*.*

The core of Takahashi's method [32] to prove head factorization in the λ-calculus is to introduce a relation ⇒i, called *internal parallel reduction*, which verifies the conditions of Property 7. We will follow a similar path in Section 6.1, to prove *least-level* factorization in the bang calculus.

**Compound systems: proving factorization in a modular way.** In this paper, we will consider compound rewriting systems that are obtained by extending the λ-calculus with extra rules to model advanced computational features.

In an abstract setting, let us consider a rewrite system (A, →) where → = →ξ ∪ →ρ. Under which condition does → admit factorization, assuming that both →ξ and →ρ do? To deal with this question, a technique for proving factorization for *compound systems* in a *modular* way has been introduced in [3]. The approach can be seen as an analogue, for factorization, of the classical technique for confluence based on the Hindley-Rosen lemma [4]: if →ξ, →ρ are e-factorizing reductions, their union →ξ ∪ →ρ also is, provided that two *local* conditions of commutation hold.

**Lemma 8 (Modular factorization [3]).** *Let* →ξ = →eξ ∪ →iξ *and* →ρ = →eρ ∪ →iρ *be* e*-factorizing relations. Let* →e := →eξ ∪ →eρ *and* →i := →iξ ∪ →iρ*. The reduction* →ξ ∪ →ρ *fulfills factorization* Fact(→e, →i) *if the following swaps hold:*

$$\to_{\mathsf{i}\xi} \cdot \to_{\mathsf{e}\rho} \ \subseteq\ \to_{\mathsf{e}\rho} \cdot \to_{\xi}^{*} \qquad\quad \text{and} \qquad\quad \to_{\mathsf{i}\rho} \cdot \to_{\mathsf{e}\xi} \ \subseteq\ \to_{\mathsf{e}\xi} \cdot \to_{\rho}^{*} \qquad\quad \textbf{(Linear Swaps)}$$

The subtlety here is to set →<sup>e</sup> <sup>ξ</sup> and →<sup>e</sup> <sup>ρ</sup> so that →<sup>e</sup> = →<sup>e</sup> <sup>ξ</sup> ∪→<sup>e</sup> <sup>ρ</sup>. As already shown in Example 1, when dealing with normalizing strategies one needs extra care.

### **3** *λ***-calculi: CbN, CbV, and bang**

We present here a generic syntax for λ-calculi, possibly containing operators. All the variants of the λ-calculus we shall study use this language. We assume some familiarity with the λ-calculus, and refer to [4,18] for details.

Given a countable set Var of variables, denoted by x, y, z, . . . , *terms* and *values* (whose sets are denoted by Λ<sup>O</sup> and Val, respectively) are defined as follows:

$$t, s, r ::= v \mid ts \mid \mathbf{o}(t_1, \ldots, t_k) \quad \textit{Terms: } \Lambda_{\mathcal{O}} \qquad\qquad v ::= x \mid \lambda x.t \quad \textit{Values: } \mathsf{Val}$$

where **o** ranges over a set O of function symbols called *operators*, each one with its own arity k ∈ ℕ. If the operators are **o**_1, ..., **o**_n, the set of terms is indicated as Λ_{**o**_1...**o**_n}. When the set O of operators is empty, the calculus is called *pure*, and the set of terms is denoted by Λ; otherwise, the calculus is *applied*.

Terms are identified up to renaming of bound variables, where abstraction is the only binder. We denote by t{s/x} the capture-avoiding substitution of s for the free occurrences of x in t. *Contexts* (with exactly one hole ⟨·⟩) are generated by the grammar below, and **c**⟨t⟩ stands for the term obtained from the context **c** by replacing the hole with the term t (possibly capturing free variables).

$$\mathbf{c} ::= \langle \cdot \rangle \mid t\mathbf{c} \mid \mathbf{c}t \mid \lambda x.\mathbf{c} \mid \mathbf{o}(t_1, \dots, \mathbf{c}, \dots, t_k) \qquad \textit{Contexts: } \mathcal{C}$$

A *rule* ρ is a binary relation on Λ_O; we also call it a ρ*-rule* and denote it by ↦ρ, writing t ↦ρ t′ rather than (t, t′) ∈ ρ. The ρ-*reduction* →ρ is the contextual closure of ρ. Explicitly, t →ρ t′ holds if t = **c**⟨r⟩ and t′ = **c**⟨r′⟩ for some context **c** with r ↦ρ r′; the term r is called a ρ-*redex*. The set of ρ-*redexes* is denoted by R_ρ.

Given a set of rules Rules, the relation → = ⋃_ρ →ρ (for ρ ∈ Rules) can equivalently be defined as the contextual closure of ↦ = ⋃_ρ ↦ρ.

#### **3.1 Call-by-Name and Call-by-Value** *λ***-calculi**

*Pure CbN and Pure CbV* λ*-calculi.* The *pure call-by-name* (CbN for short) λ-calculus [4,18] is (Λ, →β), the set of terms Λ together with the β-reduction →β, defined as the contextual closure of the usual β-rule, which we recall in (1) below.

The *pure call-by-value* (CbV for short) λ-calculus [27] is the set Λ endowed with the reduction −→<sup>β</sup><sup>v</sup> , defined as the contextual closure of the βv-rule in (2).

$$\text{CbN: } (\lambda x.t)s \mapsto_{\beta} t\{s/x\} \ \ (1) \qquad \text{CbV: } (\lambda x.t)v \mapsto_{\beta_v} t\{v/x\} \text{ with } v \in \mathsf{Val} \ \ (2)$$

*CbN and CbV* λ*-calculi.* A *CbN* (resp. *CbV* ) λ*-calculus* is the set of terms endowed with a reduction −→ which extends →<sup>β</sup> (resp. →<sup>β</sup><sup>v</sup> ).

In particular, the *applied* setting with operators (when O ≠ ∅) models richer computational features in the λ-calculus, allowing **o**-reductions as the contextual closure of **o**-rules of the form **o**(t_1, ..., t_k) ↦**o** s.

*Example 9 (Non-deterministic* λ*-calculi).* Let O = {⊕} where ⊕ is a binary operator; let →<sup>⊕</sup> be the contextual closure of the (non-deterministic) rule below:

$$\oplus(t_1, t_2) \mapsto_{\oplus} t_1 \qquad \text{and} \qquad \oplus(t_1, t_2) \mapsto_{\oplus} t_2.$$

The *non-deterministic CbN* λ*-calculus* Λcbn <sup>⊕</sup> = (Λ⊕, <sup>→</sup><sup>β</sup>⊕) is the set <sup>Λ</sup><sup>⊕</sup> with the reduction →<sup>β</sup><sup>⊕</sup> = −→<sup>β</sup> ∪ →⊕. The *non-deterministic CbV* λ*-calculus* Λcbv <sup>⊕</sup> = (Λ⊕, <sup>→</sup><sup>β</sup>v⊕) is the set <sup>Λ</sup><sup>⊕</sup> with the reduction <sup>→</sup><sup>β</sup>v<sup>⊕</sup> <sup>=</sup> −→<sup>β</sup><sup>v</sup> ∪ →⊕.
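The following Haskell sketch instantiates the generic syntax of this section with a single binary operator ⊕ and spells out "reduction as contextual closure of a rule": a one-step function collects every way of firing a root rule inside a term. The datatype and function names are ours, and substitution is kept naive, which is enough for the closed examples in the comments.

```haskell
-- Terms with one operator (⊕) and the contextual closure of a root rule.
data Term = V String | Lam String Term | App Term Term | Plus Term Term
  deriving Show

betaRoot :: Term -> [Term]                 -- the beta-rule (1), at the root only
betaRoot (App (Lam x t) s) = [substitute x s t]
betaRoot _                 = []

plusRoot :: Term -> [Term]                 -- the ⊕-rules of Example 9, at the root only
plusRoot (Plus t1 t2) = [t1, t2]
plusRoot _            = []

oneStep :: (Term -> [Term]) -> Term -> [Term]   -- contextual closure of a root rule
oneStep root t = root t ++ inside t
  where
    inside (Lam x b)  = [ Lam x b'  | b' <- oneStep root b ]
    inside (App f a)  = [ App f' a  | f' <- oneStep root f ] ++ [ App f a' | a' <- oneStep root a ]
    inside (Plus a b) = [ Plus a' b | a' <- oneStep root a ] ++ [ Plus a b' | b' <- oneStep root b ]
    inside (V _)      = []

substitute :: String -> Term -> Term -> Term    -- naive, capture-permitting substitution
substitute x s (V y)      = if x == y then s else V y
substitute x s (Lam y b)  = if x == y then Lam y b else Lam y (substitute x s b)
substitute x s (App f a)  = App (substitute x s f) (substitute x s a)
substitute x s (Plus a b) = Plus (substitute x s a) (substitute x s b)

-- For t = (x ⊕ y)(I I) from Example 1, "oneStep plusRoot tExample" offers the two
-- choices x(I I) and y(I I), while "oneStep betaRoot tExample" only fires I I in
-- argument position.
tExample :: Term
tExample = App (Plus (V "x") (V "y")) (App iComb iComb) where iComb = Lam "z" (V "z")
```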

#### **3.2 Bang calculi**

The bang calculus [11,15] is a variant of the λ-calculus inspired by linear logic. An operator ! plays the role of a marker for duplicability and discardability. Here

we allow also the presence of operators other than !, ranging over a set O. So, terms and contexts of the bang calculus (denoted by capital letters) are:

$$\begin{aligned} T, S, R &::= x \mid \lambda x. T \mid TS \mid !T \mid \mathbf{o}(T_1, \ldots, T_k) & &\textit{Terms: } \Lambda_{!\mathcal{O}}\\ \mathbf{C} &::= \langle \cdot \rangle \mid \lambda x. \mathbf{C} \mid T\mathbf{C} \mid \mathbf{C}T \mid !\mathbf{C} \mid \mathbf{o}(T_1, \ldots, \mathbf{C}, \ldots, T_k) & & \textit{Contexts: } \mathcal{C}_{!} \end{aligned}$$

Terms of the form !T are called *boxes* and their set is denoted by !Λ_{!O}. When there are no operators other than ! (*i.e.* O = ∅), the set of terms and the set of boxes are denoted by Λ_! and !Λ_!, respectively. This syntax can be expressed in the one at the beginning of Section 3, where ! is a unary operator called *bang*.

*The pure bang calculus.* The *pure* bang calculus (Λ!, →<sup>β</sup>! ) is the set of terms Λ! endowed with reduction −→<sup>β</sup>! , the closure under contexts in C! of the β!*-rule*:

$$(\lambda x.T)!S \mapsto\_{\beta!} T\{S/x\} \tag{3}$$

Intuitively, in the bang calculus the bang-operator ! marks the only terms that can be erased and duplicated. Indeed, a β*-like redex* (λx.T)S can be fired by →<sup>β</sup>! only when its argument S is a box, *i.e.* S = !R: if it is so, the content R of the box S (and not S itself) replaces any free occurrence of x in T. 3

A proof of confluence of β!-reduction −→<sup>β</sup>! is in [15].

**Notation 10** *We use the following notations to denote some notable terms.*

ι := λx.x δ := λx.xx I := λx.!x Δ := λx.x !x.

*Remark 11 (Notable terms).* The term I = λx.!x plays the role of the identity in the bang calculus: I !T −→<sup>β</sup>! !(x{T /x})=!T for any term T. Instead, the term ι = λx.x, when applied to a box !T, opens the box, *i.e.* returns its content T: ι!T −→<sup>β</sup>! x{T /x} = T. Finally, Δ !Δ −→<sup>β</sup>! Δ !Δ −→<sup>β</sup>! ... is a diverging term.
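As a quick executable check of the β!-rule (3) and of Remark 11, here is a minimal Haskell sketch of root-level β!-reduction on a bang-calculus syntax without operators; substitution is naive and the names are ours.

```haskell
-- Root beta!-steps: (λx.T)!S fires only when the argument is a box.
data BTerm = BV String | BLam String BTerm | BApp BTerm BTerm | Bang BTerm
  deriving Show

betaBangRoot :: BTerm -> Maybe BTerm
betaBangRoot (BApp (BLam x t) (Bang s)) = Just (subst x s t)
betaBangRoot _                          = Nothing

subst :: String -> BTerm -> BTerm -> BTerm      -- naive substitution, enough for closed terms
subst x s (BV y)     = if x == y then s else BV y
subst x s (BLam y b) = if x == y then BLam y b else BLam y (subst x s b)
subst x s (BApp f a) = BApp (subst x s f) (subst x s a)
subst x s (Bang t)   = Bang (subst x s t)

iota, bigI, delta :: BTerm                       -- the terms of Notation 10
iota  = BLam "x" (BV "x")                        -- ι = λx.x
bigI  = BLam "x" (Bang (BV "x"))                 -- I = λx.!x
delta = BLam "x" (BApp (BV "x") (Bang (BV "x"))) -- Δ = λx.x !x

-- betaBangRoot (BApp iota  (Bang t)) == Just t                              ι !T -> T
-- betaBangRoot (BApp bigI  (Bang t)) == Just (Bang t)                       I !T -> !T
-- betaBangRoot (BApp delta (Bang delta)) == Just (BApp delta (Bang delta))  Δ !Δ -> Δ !Δ
```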

*A bang calculus.* A *bang calculus* (Λ_{!O}, →) is the set Λ_{!O} of terms endowed with a reduction → which extends →β!. In this paper we shall consider calculi where → contains →β! and **o**-reductions →**o** (**o** ∈ O) defined from **o**-rules of the form **o**(T_1, ..., T_k) ↦**o** S, and possibly other rules. So, → = ⋃_ρ →ρ (for ρ ∈ Rules), with Rules ⊇ {β!, **o** | **o** ∈ O}. We set →_O = ⋃_{**o**∈O} →**o**.

#### **3.3 CbN and CbV translations into the bang calculus**

Our motivation to study the bang calculus is to have a general framework where both CbN [4] and CbV [27] λ-calculi can be embedded, via two distinct translations. Here we show how these translations work. We extend the simulation results in [15,30,7] for the pure case to the case with operators (Proposition 13).

Following [7], the CbV translation defined here differs from [15,30] in the application case. Section 5 will show why this optimization is crucial.

*CbN* and *CbV translations* are two maps (·)<sup>n</sup> : <sup>Λ</sup><sup>O</sup> −→ <sup>Λ</sup>!<sup>O</sup> and (·)<sup>v</sup> : <sup>Λ</sup><sup>O</sup> −→ <sup>Λ</sup>!O, respectively, translating terms of the λ-calculus into terms of the bang calculus:

<sup>3</sup> Syntax and reduction rule of the bang calculus follow [15], which is slightly different from [11]. Unlike [15] (but akin to [30,16]), here we do not use ι (aka der) as a primitive, since ι and its associated rule -→<sup>d</sup> can be simulated, see Remark 11 and (4).

$$x^{n} = x \quad (\lambda x. \, t)^{n} = \lambda x. \, t^{n} \qquad (\mathbf{o}(t\_{1}, \dots, t\_{k}))^{n} = \mathbf{o}(t\_{1}^{n}, \dots, t\_{k}^{n}) \quad (ts)^{n} = t^{n}!s^{n};$$

$$x^{\nu} = !x \quad (\lambda x. \, t)^{\nu} = !(\lambda x. t^{\nu}) \quad (\mathbf{o}(t\_{1}, \dots, t\_{k}))^{\nu} = \mathbf{o}(t\_{1}^{\nu}, \dots, t\_{k}^{\nu}) \quad (ts)^{\nu} = \begin{cases} T \, s^{\nu} & \text{if } t^{\nu} = !T \\ (\iota \, t^{\nu}) s^{\nu} & \text{otherwise.} \end{cases}$$

*Example 12.* Consider the λ-term ω := δδ: then, δ<sup>n</sup> = Δ, δ<sup>v</sup> = !Δ and ω<sup>n</sup> = Δ !Δ = ω<sup>v</sup> (δ and Δ are defined in Notation 10). The λ-term ω is diverging in CbN and CbV λ-calculi, and so is ω<sup>n</sup> = ω<sup>v</sup> in the bang calculus, see Remark 11.

For any term t ∈ ΛO, t <sup>n</sup> and t <sup>v</sup> are just different decorations of t by means of the bang-operator ! (recall that <sup>ι</sup> <sup>=</sup> λx.x). The translation (·)<sup>n</sup> puts the argument of any application into a box: in CbN any term is duplicable or discardable. On the other hand, only *values* (*i.e.* abstractions and variables) are translated by (·)<sup>v</sup> into boxes, as they are the only terms duplicable or discardable in CbV.
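The two translations can be written down directly from the clauses above. The Haskell sketch below covers the pure fragment only (no operators); the datatypes and names are ours.

```haskell
-- CbN and CbV translations of pure λ-terms into the pure bang calculus.
data LTerm = LV String | LLam String LTerm | LApp LTerm LTerm
data BTerm = BV String | BLam String BTerm | BApp BTerm BTerm | Bang BTerm

cbn :: LTerm -> BTerm
cbn (LV x)     = BV x
cbn (LLam x t) = BLam x (cbn t)
cbn (LApp t s) = BApp (cbn t) (Bang (cbn s))      -- in CbN every argument is boxed

cbv :: LTerm -> BTerm
cbv (LV x)     = Bang (BV x)                      -- values are boxed ...
cbv (LLam x t) = Bang (BLam x (cbv t))
cbv (LApp t s) = case cbv t of
    Bang f -> BApp f (cbv s)                      -- ... and a box in function position is opened
    u      -> BApp (BApp iota u) (cbv s)          -- otherwise ι = λx.x is inserted
  where iota = BLam "x" (BV "x")

-- Example 12: for δ = λx.xx and ω = δδ,
--   cbn δ gives Δ = λx.x !x,  cbv δ gives !Δ,  and  cbn ω = Δ !Δ = cbv ω.
```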

As in [15,30], we prove that the CbN translation (·)^n (resp. CbV translation (·)^v) from the pure CbN (resp. CbV) λ-calculus into the bang calculus is *sound* and *complete*: it maps β-reductions (resp. βv-reductions) of the λ-calculus into β!-reductions of the bang calculus, and conversely β!-reductions (when restricted to the image of the translation) into β-reductions (resp. βv-reductions). The same holds if we consider any **o**-reduction for operators, where we assume that the **o**-rule commutes with the translations: if **o**(t_1, ..., t_k) ↦**o** s then **o**(t_1^n, ..., t_k^n) ↦**o** s^n, and if **o**(t_1^n, ..., t_k^n) ↦**o** S then **o**(t_1, ..., t_k) ↦**o** s with s^n = S; similarly for (·)^v.

In the simulation, −→<sup>d</sup> denotes the contextual closure of the rule:

$$
\iota\, !T \mapsto_{\mathsf{d}} T \quad \text{(this is nothing but } (\lambda x.x)\, !T \mapsto_{\beta!} T\text{)}\tag{4}
$$

Clearly, →d ⊆ →β! (Remark 11). We write T ↠d S if T →d* S and S is d-normal.

### **Proposition 13 (Simulation of CbN and CbV).** *Let* t ∈ Λ<sup>O</sup> *and* **o** ∈ O*.*


*Example 14.* Let t = (λz.z)x y and t′ = xy. So t →β t′ with t^n = (λz.z)!x !y →β! x !y = t′^n; and t →βv t′ with t^v = (ι((λz.!z)!x))!y →β! (ι !x)!y →d x !y = t′^v.

#### **4 The least-level strategy**

The bang calculus Λ! has a natural normalizing strategy, derived from linear logic [8], namely the *least-level reduction*. It reduces only redexes at *least level*, where the *level* of a redex R in a term T is the number of bangs ! in which R is nested.

Least-level reduction is easily extended to a general bang calculus (Λ!O, −→). The level of a redex R is then the number of bangs ! and operators **o** in which R is nested; intuitively, least-level reduction fires a redex which is *minimally nested*.

Below, we formalize the reduction in a way that is independent of the specific shape of the redexes, and even of the specific definition of level one chooses. The interest of least-level reduction is in the properties it satisfies. All our developments will rely on such properties, rather than the specific definition of least level.

In this section, → = ⋃_ρ →ρ for ρ ∈ Rules (for a generic set of rules Rules). We write R = ⋃_ρ R_ρ (again, with ρ ∈ Rules) for the set of *all* redexes.

#### **4.1 Least-level reduction in bang calculi**

The *level* of a redex occurrence R in a term T is a measure of its depth. Formally, we indicate the *occurrence of a subterm* R in T with the context **C** such that **C**⟨R⟩ = T. Its level is then the *level* ℓ(**C**) ∈ ℕ of the hole in **C**. The definition of *level* for contexts in a bang calculus Λ_{!O} is formalized as follows.

$$\begin{aligned} \ell(\langle \cdot \rangle) &= 0 & \ell(\lambda x. \mathbf{C}) &= \ell(\mathbf{C}) & \ell(\mathbf{C}T) &= \ell(\mathbf{C}) & \ell(T\mathbf{C}) &= \ell(\mathbf{C})\\ \ell(!\mathbf{C}) &= \ell(\mathbf{C}) + 1 & \ell(\mathbf{o}(\dots, \mathbf{C}, \dots)) &= \ell(\mathbf{C}) + 1 \end{aligned} \tag{5}$$

Note that the level increases by 1 in the scope of !, and of any operator **o** ∈ O.

A reduction step <sup>T</sup> −→<sup>ρ</sup> <sup>S</sup> is *at level* <sup>k</sup> if it fires a <sup>ρ</sup>-redex at level <sup>k</sup> <sup>∈</sup> <sup>N</sup>; it is *least-level* if it reduces a redex whose level is minimal.

The *least level* ℓℓ(T) of a term T expresses the minimal level of the redex occurrences in T; if no redex occurs in T, we set ℓℓ(T) = ∞. Formally:

**Definition 15 (Least-level reduction).** *Let* → = ⋃_ρ →ρ *(for* ρ ∈ Rules*) and* R = ⋃_ρ R_ρ *the set of redexes. Given a function* ℓ(·) *from contexts to* ℕ*:*

**–** *The* least level *of a term* T *is defined as*<sup>4</sup>

$$\ell\ell(T) := \inf\{\ell(\mathbf{C}) \mid T = \mathbf{C}\langle R \rangle \text{ for some } R \in \mathcal{R}\} \in (\mathbb{N} \cup \{\infty\}).\tag{6}$$

**–** *A step* T →ρ S *is:*
	- *1.* at level k*, noted* T →ρ:k S*, if* T = **C**⟨R⟩*,* S = **C**⟨R′⟩*,* R ↦ρ R′*,* ℓ(**C**) = k*;*
	- *2.* least-level*, noted* T →ℓℓρ S*, if* T →ρ:k S *and* k = ℓℓ(T)*;*
	- *3.* internal*, noted* T →¬ℓℓρ S*, if* T →ρ:k S *and* k > ℓℓ(T)*.*

Note that → = →ℓℓ ∪ →¬ℓℓ and that our definitions solve the issue of Example 1. Indeed, the definition of the least level ℓℓ(T) of a term, and hence the definition of →ℓℓρ, depend on the *whole* set R = ⋃_ρ R_ρ of redexes associated with →.<sup>5</sup>

<sup>4</sup> Recall that inf ∅ = ∞, when ∅ is seen as the empty subset of ℕ with the usual order.
<sup>5</sup> We should write ℓℓ_R(T), →ℓℓ_Rρ and →¬ℓℓ_Rρ, but we avoid it for the sake of readability.

*Normal Forms.* It is immediate that →ℓℓ is a *strategy* for →. Indeed, →ℓℓ and → have the *same normal forms*, because →ℓℓ ⊆ → and, if a term has a →-redex, it has a redex at least level, *i.e.* it has a →ℓℓ-redex.

*Remark 16 (Least level of normal forms).* Note that ℓℓ(T) = ∞ if and only if T is →-normal, because ℓ(**C**) ∈ ℕ for all contexts **C**.

*A good least-level reduction.* The beauty of least-level reduction for the bang calculus is that it satisfies some elegant properties, which allow for neat proofs, in particular monotonicity and internal invariance (in Definition 17). The developments in the rest of the paper rely on such properties, and in fact will apply to any calculus whose reduction → has the properties described below.

**Definition 17 (Good least-level).** *A reduction* → *has a* good least-level *if:*

*1. (monotonicity)* T → S *implies* ℓℓ(T) ≤ ℓℓ(S)*;*
*2. (internal invariance)* T →¬ℓℓ S *implies* ℓℓ(T) = ℓℓ(S)*.*
Point 1 states that no step can decrease the least level of a term. Point 2 says that internal steps cannot change the least level of a term. Therefore, only leastlevel steps may increase the least level. Together, they imply persistence: only least-level steps can approach normal forms.

**Property 18 (Persistence)** *If* → *has a good least-level, then* T →¬ℓℓ S *implies that* S *is not* →*-normal.*

Reduction →β! in the pure bang calculus (Λ_!, →β!) has a good least-level. More generally, the same holds when extending the reduction with operators.

**Proposition 19 (Good least-level of bang calculi).** *Given* Λ!O*, let* →= →<sup>β</sup>! ∪ →O*, where each* **o** ∈ O *has a redex of shape* **o**(P1,...,Pk)*. The reduction* → *has a good least-level.*

#### **4.2 Least-level for a bang calculus: examples.**

Let us see more closely the least-level reduction for a bang calculus (Λ!O, −→). For concreteness, we consider Rules = {β!, **o** | **o** ∈ O}, hence the set of redexes is R = R<sup>β</sup>! ∪ RO, where R<sup>O</sup> is the set of terms **o**(T1,...,Tk) for any **o** ∈ O.

We observe that the least level $\ell\ell(T)$ of a term $T \in \Lambda^!_{\mathcal{O}}$ can easily be defined in a direct way, by induction on $T$:

$$\begin{array}{l} - \; \ell\ell(T) = 0 \text{ if } T \in \mathcal{R} = \mathcal{R}_{\beta_!} \cup \mathcal{R}_{\mathcal{O}},\\ - \; \text{otherwise, } \ell\ell(x) = \infty \text{ and} \end{array}$$

$$\ell\ell(\lambda x.T) = \ell\ell(T) \qquad \ell\ell(!T) = \ell\ell(T) + 1 \qquad \ell\ell(TS) = \min\{\ell\ell(T), \ell\ell(S)\}.$$

*Example 20 (Least level of a term).* Let $R \in \mathcal{R}_{\beta_!}$. If $T_0 := R\,!R$, then $\ell\ell(T_0) = 0$. If $T_1 := x\,!R$ then $\ell\ell(T_1) = 1$. If $T_2 := \mathbf{o}(x, y)\,!R$ then $\ell\ell(T_2) = 0$, as $\mathbf{o}(x, y) \in \mathcal{R}_{\mathcal{O}}$.


Intuitively, least-level reduction fires a redex that is *minimally nested*, where a redex is any subterm whose form is in $\mathcal{R} = \mathcal{R}_{\beta_!} \cup \mathcal{R}_{\mathcal{O}}$. Note that least-level reduction can choose to fire one among possibly *several* redexes at minimal level.

*Example 21.* Let us revisit Example 20 with $R = \iota\,!z \in \mathcal{R}_{\beta_!}$ (so $R \to_{\beta_!} z$, see Remark 11). Then $T_1 := x\,!R \to^{\ell\ell}_{\beta_!} x\,!z$, but $T_0 := R\,!R \not\to^{\ell\ell}_{\beta_!} R\,!z$ and $T_2 := \mathbf{o}(x, y)\,!R \not\to^{\ell\ell}_{\beta_!} \mathbf{o}(x, y)\,!z$. Also, $\mathbf{o}(x, R) \not\to^{\ell\ell}_{\beta_!} \mathbf{o}(x, z)$ although $\mathbf{o}(x, R) \to_{\beta_!} \mathbf{o}(x, z)$.

Let $S = \iota\,!(z\,!z)$ (so $S \to_{\beta_!} z\,!z$). In $(\lambda z.S)\,!S$, two least-level steps are possible: firing the outer redex, $(\lambda z.S)\,!S \to^{\ell\ell}_{\beta_!} \iota\,!(S\,!S)$, or firing the copy of $S$ in the abstraction body, $(\lambda z.S)\,!S \to^{\ell\ell}_{\beta_!} (\lambda z.z\,!z)\,!S$. But $(\lambda z.S)\,!S \not\to^{\ell\ell}_{\beta_!} (\lambda z.S)\,!(z\,!z)$ although $(\lambda z.S)\,!S \to_{\beta_!} (\lambda z.S)\,!(z\,!z)$.
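To make the inductive clauses above concrete, here is a minimal, self-contained Haskell sketch (our own encoding, not from the paper): it represents bang-calculus terms with a single binary operator $\mathbf{o}$, takes $\iota = \lambda y.y$ (consistent with Remark 11's $\iota\,!z \to_{\beta_!} z$), and recomputes the least levels of Example 20.

```haskell
-- Minimal sketch of the least level ll(T) for bang-calculus terms,
-- with one binary operator "o" standing for the set O of the paper.
-- All names (Term, leastLevel, ...) are ours, not the paper's.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Bang Term
  | Op Term Term          -- o(T1, T2)
  deriving Show

-- A term is a redex if it is a beta!-redex (\x.T)!S or an operator redex o(T1,T2).
isRedex :: Term -> Bool
isRedex (App (Lam _ _) (Bang _)) = True
isRedex (Op _ _)                 = True
isRedex _                        = False

-- N ∪ {∞} modelled as Maybe Int (Nothing = ∞, so inf of the empty set is ∞).
type Level = Maybe Int

minL :: Level -> Level -> Level
minL Nothing  m        = m
minL n        Nothing  = n
minL (Just n) (Just m) = Just (min n m)

-- Direct inductive definition of the least level (Section 4.2).
leastLevel :: Term -> Level
leastLevel t | isRedex t = Just 0
leastLevel (Var _)   = Nothing
leastLevel (Lam _ t) = leastLevel t
leastLevel (App t s) = minL (leastLevel t) (leastLevel s)
leastLevel (Bang t)  = fmap (+ 1) (leastLevel t)
leastLevel (Op t s)  = minL (leastLevel t) (leastLevel s)  -- unreachable: Op is always a redex

-- Example 20, with R = i!z where i = \y.y:
r :: Term
r = App (Lam "y" (Var "y")) (Bang (Var "z"))

main :: IO ()
main = do
  print (leastLevel (App r (Bang r)))                          -- T0 = R !R      ~> Just 0
  print (leastLevel (App (Var "x") (Bang r)))                  -- T1 = x !R      ~> Just 1
  print (leastLevel (App (Op (Var "x") (Var "y")) (Bang r)))   -- T2 = o(x,y) !R ~> Just 0
```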

#### **4.3 Least-level for CbN and CbV** *λ***-calculi**

The definition of least-level reduction in Section 4.1 is independent of the specific notion of level chosen, and of the specific calculus. The idea is that the reduction strategy persistently fires a redex at minimal level, once such a notion is set.

Least-level reduction can indeed be defined also for the CbN and CbV λ-calculi, given a suitable definition of level. In CbN, we count the number of nested arguments and operators containing the redex occurrence. In CbV, we count the number of nested operators and *unapplied* abstractions containing the redex occurrence, where an abstraction is unapplied if it does not occur in the function position of an application. Formally, a redex occurrence is identified by a context (as explained in Section 4.1), and we define the *level* $\ell^{\mathrm{CbN}}(\mathbf{c}) \in \mathbb{N}$ and $\ell^{\mathrm{CbV}}(\mathbf{c}) \in \mathbb{N}$ of a context $\mathbf{c}$ in the CbN and CbV λ-calculi, respectively, as follows.

$$\begin{array}{rcl@{\qquad\qquad}rcl}
\ell^{\mathrm{CbN}}(\langle\cdot\rangle) &=& 0 & \ell^{\mathrm{CbV}}(\langle\cdot\rangle) &=& 0 \\
\ell^{\mathrm{CbN}}(\lambda x.\mathbf{c}) &=& \ell^{\mathrm{CbN}}(\mathbf{c}) & \ell^{\mathrm{CbV}}(\lambda x.\mathbf{c}) &=& \ell^{\mathrm{CbV}}(\mathbf{c}) + 1 \\
\ell^{\mathrm{CbN}}(\mathbf{c}\,t) &=& \ell^{\mathrm{CbN}}(\mathbf{c}) & \ell^{\mathrm{CbV}}((\lambda x.\mathbf{c})\,t) &=& \ell^{\mathrm{CbV}}(\mathbf{c}) \\
\ell^{\mathrm{CbN}}(t\,\mathbf{c}) &=& \ell^{\mathrm{CbN}}(\mathbf{c}) + 1 & \ell^{\mathrm{CbV}}(\mathbf{c}\,t) &=& \ell^{\mathrm{CbV}}(\mathbf{c}) \quad (\mathbf{c}\text{ not an abstraction}) \\
& & & \ell^{\mathrm{CbV}}(t\,\mathbf{c}) &=& \ell^{\mathrm{CbV}}(\mathbf{c}) \\
\ell^{\mathrm{CbN}}(\mathbf{o}(\dots,\mathbf{c},\dots)) &=& \ell^{\mathrm{CbN}}(\mathbf{c}) + 1 & \ell^{\mathrm{CbV}}(\mathbf{o}(\dots,\mathbf{c},\dots)) &=& \ell^{\mathrm{CbV}}(\mathbf{c}) + 1.
\end{array}$$

In both CbN and CbV λ-calculi, the *least level* of a term (denoted by $\ell\ell^{\mathrm{CbN}}(\cdot)$ and $\ell\ell^{\mathrm{CbV}}(\cdot)$) and the *least-level* and *internal* reductions are given by Definition 15 (replacing $\ell(\cdot)$ with $\ell^{\mathrm{CbN}}(\cdot)$ for CbN, and with $\ell^{\mathrm{CbV}}(\cdot)$ for CbV).
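For instance, under the CbN clauses above, the context $\mathbf{o}(y,\, x\,\langle\cdot\rangle)$ places the hole under one operator and one argument:

$$\ell^{\mathrm{CbN}}\big(\mathbf{o}(y,\; x\,\langle\cdot\rangle)\big) = \ell^{\mathrm{CbN}}(x\,\langle\cdot\rangle) + 1 = \big(\ell^{\mathrm{CbN}}(\langle\cdot\rangle) + 1\big) + 1 = 2,$$

so a redex plugged into this context can be fired by a CbN least-level step only if no redex occurs at level 0 or 1 in the whole term.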

In Section 5 we will see that the definitions of CbN and CbV least level are not arbitrary, but induced by the CbN and CbV translations defined in Section 3.3.

#### **5 Embedding of CbN and CbV by level**

Here we refine the analysis of the CbN and CbV translations given in Section 3.3, by showing two new results: translations preserve normal forms (Proposition 22) and least-level (Proposition 25), back and forth. This way, to obtain least-level

*factorization* or least-level *normalization* results, it suffices to prove them in the bang calculus. The translation transfers the results into the CbN and CbV λ-calculi (Theorem 26). We use here the expression "translate" in a strong sense: the results for CbN and CbV λ-calculi are obtained from the corresponding results in the bang calculus almost for free, just via CbN and CbV translations.

*Preservation of normal forms.* The targets of the CbN translation (·)<sup>n</sup> and CbV translation (·)<sup>v</sup> into the bang calculus can be *characterized syntactically*. A fine analysis of these fragments of the bang calculus (see [12] for details) proves that both CbN and CbV translations preserve normal forms, back and forth.

### **Proposition 22 (Preservation of normal forms).** *Let* t, s ∈ Λ<sup>O</sup> *and* **o** ∈ O*.*


By Remark 16, Proposition 22 can be seen as the fact that the CbN and CbV translations preserve the least level of a term, back and forth, when the least level is infinite. Actually, this holds more generally, for any value of the least level.

*Preservation of levels.* We aim to show that least-level steps in CbN and CbV λ-calculi correspond to least-level steps in the bang calculus—back and forth—via CbN and CbV translations, respectively (Proposition 25). This result is subtle, one of the main technical contributions of this paper.

First, we extend the definition of translations to contexts. The *CbN and CbV translations for contexts* are two functions (·)<sup>n</sup> : C −→ C! and (·)<sup>v</sup> : C −→ C!, respectively, mapping contexts of the λ-calculus into contexts of the bang calculus:

$$\begin{array}{rcl@{\qquad}rcl}
\langle\cdot\rangle^{\mathsf{n}} &=& \langle\cdot\rangle & \langle\cdot\rangle^{\mathsf{v}} &=& \langle\cdot\rangle \\
(\lambda x.\mathbf{c})^{\mathsf{n}} &=& \lambda x.\mathbf{c}^{\mathsf{n}} & (\lambda x.\mathbf{c})^{\mathsf{v}} &=& !(\lambda x.\mathbf{c}^{\mathsf{v}}) \\
(\mathbf{o}(t_1,\dots,\mathbf{c},\dots,t_k))^{\mathsf{n}} &=& \mathbf{o}(t_1^{\mathsf{n}},\dots,\mathbf{c}^{\mathsf{n}},\dots,t_k^{\mathsf{n}}) & (\mathbf{o}(t_1,\dots,\mathbf{c},\dots,t_k))^{\mathsf{v}} &=& \mathbf{o}(t_1^{\mathsf{v}},\dots,\mathbf{c}^{\mathsf{v}},\dots,t_k^{\mathsf{v}}) \\
(\mathbf{c}\,t)^{\mathsf{n}} &=& \mathbf{c}^{\mathsf{n}}\, !(t^{\mathsf{n}}) & (\mathbf{c}\,t)^{\mathsf{v}} &=& \begin{cases} \mathbf{C}\, t^{\mathsf{v}} & \text{if } \mathbf{c}^{\mathsf{v}} = !\mathbf{C} \\ (\iota\, \mathbf{c}^{\mathsf{v}})\, t^{\mathsf{v}} & \text{otherwise} \end{cases} \\
(t\,\mathbf{c})^{\mathsf{n}} &=& t^{\mathsf{n}}\, !(\mathbf{c}^{\mathsf{n}}) & (t\,\mathbf{c})^{\mathsf{v}} &=& \begin{cases} T\, \mathbf{c}^{\mathsf{v}} & \text{if } t^{\mathsf{v}} = !T \\ (\iota\, t^{\mathsf{v}})\, \mathbf{c}^{\mathsf{v}} & \text{otherwise} \end{cases}
\end{array}$$

Note that the CbN (resp. CbV) level of a context defined in Section 4.3 increases by 1 whenever the CbN (resp. CbV) translation for contexts adds a $!$. Thus, the CbN and CbV translations preserve, back and forth, the level of a redex and the least level of a term. Said differently, the levels for CbN and CbV are defined in Section 4.3 precisely so as to enable the preservation of levels via the CbN and CbV translations.
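As a small sanity check of this correspondence, here is a Haskell sketch (our own encoding, restricted to the operator-free fragment, and assuming the standard CbN translation of terms from Section 3.3, namely $x^{\mathsf{n}} = x$, $(\lambda x.t)^{\mathsf{n}} = \lambda x.t^{\mathsf{n}}$, $(ts)^{\mathsf{n}} = t^{\mathsf{n}}\,!(s^{\mathsf{n}})$): it translates a CbN context following the table above and checks that the number of $!$'s wrapped around the hole equals its CbN level.

```haskell
-- Terms and one-hole contexts of the (operator-free) CbN lambda-calculus,
-- and of the bang calculus. All names are ours.
data Tm  = V String | L String Tm | A Tm Tm
data Cx  = Hole | LamC String Cx | AppL Cx Tm | AppR Tm Cx

data BTm = BV String | BL String BTm | BA BTm BTm | BBang BTm
data BCx = BHole | BLamC String BCx | BAppL BCx BTm | BAppR BTm BCx | BBangC BCx

-- CbN level of a context (Section 4.3, operator-free fragment).
cbnLevel :: Cx -> Int
cbnLevel Hole        = 0
cbnLevel (LamC _ c)  = cbnLevel c
cbnLevel (AppL c _)  = cbnLevel c
cbnLevel (AppR _ c)  = 1 + cbnLevel c

-- CbN translation of terms: x^n = x, (\x.t)^n = \x.t^n, (t s)^n = t^n !(s^n).
trT :: Tm -> BTm
trT (V x)   = BV x
trT (L x t) = BL x (trT t)
trT (A t s) = BA (trT t) (BBang (trT s))

-- CbN translation of contexts (table above).
trC :: Cx -> BCx
trC Hole       = BHole
trC (LamC x c) = BLamC x (trC c)
trC (AppL c t) = BAppL (trC c) (BBang (trT t))
trC (AppR t c) = BAppR (trT t) (BBangC (trC c))

-- Number of !'s wrapped around the hole in a bang-calculus context.
bangsAroundHole :: BCx -> Int
bangsAroundHole BHole       = 0
bangsAroundHole (BLamC _ c) = bangsAroundHole c
bangsAroundHole (BAppL c _) = bangsAroundHole c
bangsAroundHole (BAppR _ c) = bangsAroundHole c
bangsAroundHole (BBangC c)  = 1 + bangsAroundHole c

main :: IO ()
main = do
  let c = AppR (V "y") (LamC "x" (AppL Hole (V "z")))   -- y (\x. <.> z)
  print (cbnLevel c, bangsAroundHole (trC c))            -- (1, 1)
```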

#### **Lemma 23 (Preservation of level via CbN translation).**


*3.* For the least level of a term: *for any term* $t \in \Lambda_{\mathcal{O}}$*, one has* $\ell\ell^{\mathrm{CbN}}(t) = \ell\ell(t^{\mathsf{n}})$*.*

#### **Lemma 24 (Preservation of level via CbV translation).**


From the two lemmas above it follows that CbN and CbV translations preserve least-level and internal reductions, back and forth.

**Proposition 25 (Preservation of least-level and internal reductions).** *Let* t ∈ Λ<sup>O</sup> *and* **o** ∈ O*.*


As a consequence, least-level reduction induces factorization in CbN and CbV λ-calculi as soon as it does in the bang calculus. And, by Proposition 22, it is a normalizing strategy in CbN and CbV as soon as it is so in the bang calculus.

**Theorem 26 (Factorization and normalization by translation).** *Let* $\Lambda^{\mathsf{cbn}}_{\mathcal{O}} = (\Lambda_{\mathcal{O}}, \to_{\beta} \cup \to_{\mathcal{O}})$ *and* $\Lambda^{\mathsf{cbv}}_{\mathcal{O}} = (\Lambda_{\mathcal{O}}, \to_{\beta_v} \cup \to_{\mathcal{O}})$*.*


A similar result also holds when extending the pure calculi with a rule $\mapsto_{\rho}$ other than $\mapsto_{\mathbf{o}}$, as long as the translation preserves $\rho$-redexes, back and forth.

*Remark 27 (Preservation of least level and of normal forms).* Preservation of normal forms and of the least level is delicate. For instance, it does not hold with the definition of the CbV translation $(\cdot)^{\mathsf{v}}$ in [15,30]. There, the translation of $t = rs \in \Lambda$ would be $t^{\mathsf{v}} = (\iota\,!(r^{\mathsf{v}}))\,s^{\mathsf{v}}$, and then Proposition 22 and Proposition 25 would not hold: $\iota\,!(r^{\mathsf{v}})$ is a $\beta_!$-redex in $t^{\mathsf{v}}$ (see Remark 11), hence $t^{\mathsf{v}}$ would not be normal even though $t$ is, and $\ell\ell(t^{\mathsf{v}}) = 0$ even though $\ell\ell^{\mathrm{CbV}}(t) \neq 0$. This is why we defined two distinct cases when defining $(\cdot)^{\mathsf{v}}$ for applications, akin to Bucciarelli *et al.* [7].

#### **6 Least-level factorization via bang calculus**

We have shown that least-level factorization in a bang calculus $\Lambda^!_{\mathcal{O}}$ implies least-level factorization in the corresponding CbN and CbV calculi, via forth-and-back translation. The central question now is *how to prove least-level factorization* for a bang calculus: this section is devoted to that, in the pure and applied cases.

**Overview.** Let us overview our approach by considering $\mathcal{O} = \{\mathbf{o}\}$ and $\to \,=\, \to_{\beta_!} \cup \to_{\mathbf{o}}$. Since by definition $\to^{\ell\ell} \,=\, \to^{\ell\ell}_{\beta_!} \cup \to^{\ell\ell}_{\mathbf{o}}$ (and $\to^{\neg\ell\ell} \,=\, \to^{\neg\ell\ell}_{\beta_!} \cup \to^{\neg\ell\ell}_{\mathbf{o}}$), Lemma 8 states that we can *decompose* least-level factorization of $\to$ into three modules:


Note that, for each of $\to^{\ell\ell}_{\beta_!}$ and $\to^{\ell\ell}_{\mathbf{o}}$, the least level is defined with respect to the set of *all* redexes $\mathcal{R} = \mathcal{R}_{\beta_!} \cup \mathcal{R}_{\mathbf{o}}$, so as to have $\to^{\ell\ell} \,=\, \to^{\ell\ell}_{\beta_!} \cup \to^{\ell\ell}_{\mathbf{o}}$. This approach solves the issue we mentioned in Example 1.

Clearly, Points 2 and 3 depend on the specific rule $\to_{\mathbf{o}}$. However, the beauty of a modular approach is that Point 1 can be established in general: we do not need to know $\to_{\mathbf{o}}$, only the shape of its redexes given by $\mathcal{R}_{\mathbf{o}}$. In Section 6.1 we provide a general result of least-level factorization for $\to_{\beta_!}$ (Theorem 28). In fact, we shall show a bit more: the way of decomposing the study of factorization that we have sketched can be applied to the study of least-level factorization of any reduction $\to \,=\, \to_{\beta_!} \cup \to_{\rho}$, as long as $\to$ has a good least-level.

Once (1) is established (once and for all), to prove factorization of a reduction →<sup>β</sup>! ∪ →**<sup>o</sup>** we are only left with (2) and (3). In Section 6.3 we show that the proof of the two linear swaps can be reduced to a single, simple test, involving only the →**<sup>o</sup>** step (Proposition 34). In Section 7, we will illustrate how all elements play together on a concrete case, applying them to non-deterministic λ-calculi.

### **6.1 Factorization of** *→<sup>β</sup>***! in a bang calculus**

We show that $\to_{\beta_!}$ *factorizes* via least-level reduction (Theorem 28). This holds for a definition of $\to^{\ell\ell}_{\beta_!}$ (as in Section 4) where the set of redexes $\mathcal{R}$ contains $\mathcal{R}_{\beta_!} \cup \mathcal{R}_{\mathcal{O}}$—this generalization has essentially no cost, and allows us to use Theorem 28 as a module in the factorization of larger reductions containing $\to_{\beta_!}$.

We prove factorization via Takahashi's parallel reduction method [32]. We define a reflexive reduction $\Rightarrow^{\neg\ell\ell}_{\beta_!}$ (called *parallel internal* $\beta_!$*-reduction*) which fulfills the conditions of Property 7, *i.e.* $(\Rightarrow^{\neg\ell\ell}_{\beta_!})^* = (\to^{\neg\ell\ell}_{\beta_!})^*$ and $\Rightarrow^{\neg\ell\ell}_{\beta_!} \cdot \to^{\ell\ell}_{\beta_!} \;\subseteq\; (\to^{\ell\ell}_{\beta_!})^* \cdot \Rightarrow^{\neg\ell\ell}_{\beta_!}$.

The tricky point is to prove that $\Rightarrow^{\neg\ell\ell}_{\beta_!} \cdot \to^{\ell\ell}_{\beta_!} \;\subseteq\; (\to^{\ell\ell}_{\beta_!})^* \cdot \Rightarrow^{\neg\ell\ell}_{\beta_!}$. We adapt the proof technique of [2]. All details are in [12]. Here we just give the definition of $\Rightarrow^{\neg\ell\ell}_{\beta_!}$.

We first introduce $\Rightarrow_{\beta_!:n}$ with $n \in \mathbb{N} \cup \{\infty\}$ (the parallel version of $\to_{\beta_!:n}$), which fires simultaneously a number of $\beta_!$-redexes at level at least $n \in \mathbb{N}$; $\Rightarrow_{\beta_!:\infty}$ does not reduce any $\beta_!$-redex: $T \Rightarrow_{\beta_!:\infty} S$ implies $T = S$.

$$\frac{}{x \Rightarrow_{\beta_!:\infty} x} \qquad
\frac{T \Rightarrow_{\beta_!:n} T'}{\lambda x.T \Rightarrow_{\beta_!:n} \lambda x.T'} \qquad
\frac{T \Rightarrow_{\beta_!:m} T' \quad S \Rightarrow_{\beta_!:n} S'}{T\,S \Rightarrow_{\beta_!:\min\{m,n\}} T'\,S'}$$
$$\frac{T \Rightarrow_{\beta_!:n} T'}{!T \Rightarrow_{\beta_!:n+1}\; !T'} \qquad
\frac{T \Rightarrow_{\beta_!:m} T' \quad S \Rightarrow_{\beta_!:n} S'}{(\lambda x.T)\,!S \Rightarrow_{\beta_!:0} T'\{S'/x\}}$$

The *parallel internal* $\beta_!$*-reduction* $\Rightarrow^{\neg\ell\ell}_{\beta_!}$ is the parallel version of $\to^{\neg\ell\ell}_{\beta_!}$: it fires simultaneously a number of $\beta_!$-redexes that are not at minimal level. Formally,

$$T \Rightarrow^{\neg\ell\ell}_{\beta_!} S \quad\text{ if }\quad T \Rightarrow_{\beta_!:n} S \text{ with } n = \infty \text{ or } n > \ell\ell(T).$$
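For instance, taking $R = \iota\,!z$ as in Example 21 and reading the rules above:

$$\ell\ell(R\;!!R) = 0, \qquad R\;!!R \;\Rightarrow_{\beta_!:2}\; R\;!!z, \qquad \text{hence}\quad R\;!!R \;\Rightarrow^{\neg\ell\ell}_{\beta_!}\; R\;!!z \ \text{ since } 2 > \ell\ell(R\;!!R);$$

firing the outer copy of $R$ as well gives $R\;!!R \Rightarrow_{\beta_!:0} z\;!!z$, which is not internal since $0 = \ell\ell(R\;!!R)$.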

**Theorem 28 (Least-level factorization of** $\to_{\beta_!}$**).** *Let* $\to_{\rho}$ *be the contextual closure of a rule* $\mapsto_{\rho}$*, and assume that* $\to \,=\, \to_{\beta_!} \cup \to_{\rho}$ *has a good least-level in* $\Lambda^!_{\mathcal{O}}$*. Then* $T \to^*_{\beta_!} S$ *implies* $T \;(\to^{\ell\ell}_{\beta_!})^* \cdot (\to^{\neg\ell\ell}_{\beta_!})^*\; S$*.*

In particular, as $\to_{\beta_!}$ has a good least-level (Proposition 19) in $\Lambda^!$, we have:

**Corollary 29 (Least-level factorization in the pure bang calculus).** *In the pure bang calculus* $(\Lambda^!, \to_{\beta_!})$*, if* $T \to^*_{\beta_!} S$ *then* $T \;(\to^{\ell\ell}_{\beta_!})^* \cdot (\to^{\neg\ell\ell}_{\beta_!})^*\; S$*.*

*Surface Digression.* According to Definition 15, $\beta_!$-reduction $\to_{\beta_!:0}$ at level 0 (called *surface reduction* in Simpson [31]) can only fire redexes at level 0, *i.e.*, redexes that are not inside boxes or other operators. It can be equivalently defined as the closure of $\mapsto_{\beta_!}$ under contexts $\mathbf{S}$ defined by $\mathbf{S} ::= \langle\cdot\rangle \mid \lambda x.\mathbf{S} \mid \mathbf{S}\,T \mid T\,\mathbf{S}$. Since $\to_{\beta_!:0} \subseteq \to^{\ell\ell}_{\beta_!}$, from least-level factorization (Corollary 29) and monotonicity (Proposition 19), a new proof of a result already proven by Simpson [31] follows.

**Corollary 30 (Surface factorization in the pure bang calculus).** *In the pure bang calculus* $(\Lambda^!, \to_{\beta_!})$*, if* $T \to^*_{\beta_!} S$ *then* $T \to^*_{\beta_!:0} \cdot \to^*_{\beta_!:k} S$ *with* $k > 0$*.*

#### **6.2 Pure calculi and least-level normalization**

Least-level factorization of →<sup>β</sup>! implies in particular least-level factorization for →<sup>β</sup> and →<sup>β</sup><sup>v</sup> . As a consequence, least-level reduction is a normalizing strategy for all three pure calculi: the bang calculus, the CbN, and the CbV λ-calculi.

*The pure bang calculus.* The least-level reduction $\to^{\ell\ell}_{\beta_!}$ is a *normalizing strategy* for $\to_{\beta_!}$. Indeed, it satisfies all the ingredients of Lemma 4. Since we have least-level factorization (Corollary 29), the same normal forms, and *persistence* (Proposition 19), $\to^{\ell\ell}_{\beta_!}$ is a *complete strategy* for $\to_{\beta_!}$: if $T \to^*_{\beta_!} S$ and $S$ is $\beta_!$-normal, then $T \;(\to^{\ell\ell}_{\beta_!})^*\; S$.

We already observed (Example 21) that the least-level reduction $\to^{\ell\ell}_{\beta_!}$ is non-deterministic, because several redexes at least level may be available. Such non-determinism is however harmless and inessential, because $\to^{\ell\ell}_{\beta_!}$ is *uniform*.

**Lemma 31 (Quasi-Diamond).** *In the pure bang calculus* $(\Lambda^!, \to_{\beta_!})$*, the reduction* $\to^{\ell\ell}_{\beta_!}$ *is quasi-diamond (Property 5), and therefore uniform.*

Putting all the ingredients together, we have (by Lemma 4):

**Theorem 32 (Least-level normalization).** *In the pure bang calculus* $(\Lambda^!, \to_{\beta_!})$*, the least-level reduction* $\to^{\ell\ell}_{\beta_!}$ *is a normalizing strategy for* $\to_{\beta_!}$*.*

Theorem 32 means not only that if T is weakly β!-normalizing then T can reach its normal form by just performing least-level steps, but also that performing *whatever* least-level steps eventually leads to the normal form, if any.

*Pure CbV and CbN* λ*-calculi.* By forth-and-back translation (Theorem 26), the least-level factorization and normalization results for the pure bang calculus immediately transfer to the (pure) CbN and CbV settings.

#### **Theorem 33 (CbV and CbN least-level normalization).**

**–** CbN*: In* $(\Lambda, \to_{\beta})$*,* $\to^{\ell\ell}_{\beta}$ *is a normalizing strategy for* $\to_{\beta}$*.*

**–** CbV*: In* $(\Lambda, \to_{\beta_v})$*,* $\to^{\ell\ell}_{\beta_v}$ *is a normalizing strategy for* $\to_{\beta_v}$*.*

#### **6.3 Least-level Factorization, Modularly**

Least-level factorization of $\to_{\beta_!}$ (Theorem 28) can be used to prove factorization for a more complex calculus. Indeed, a simple and modular *test* establishes least-level factorization of a reduction $\to_{\beta_!} \cup \to_{\rho}$ (where $\to_{\rho}$ is a reduction added to $\to_{\beta_!}$), by adapting a similar result in [3]. The test relies on the fact that we have already proved Theorem 28, and it *simplifies* Lemma 8: the proof of the two linear swaps of Lemma 8 is reduced to a single, easier check, which only involves the rule $\to_{\rho}$. As usual, the least level in $\to^{\ell\ell}_{\beta_!}$ and $\to^{\ell\ell}_{\rho}$ is defined with respect to the set $\mathcal{R} = \mathcal{R}_{\beta_!} \cup \mathcal{R}_{\rho}$ of redexes. An example of the use of this test is in Section 7.

**Proposition 34 (Modular test for least-level factorization).** *Let* $\to_{\rho}$ *be the contextual closure of a rule* $\mapsto_{\rho}$*, and assume that* $\to \,=\, \to_{\beta_!} \cup \to_{\rho}$ *has a good least-level in* $\Lambda^!_{\mathcal{O}}$*. Then* $\to$ *factorizes via* $\to^{\ell\ell} \,=\, \to^{\ell\ell}_{\beta_!} \cup \to^{\ell\ell}_{\rho}$ *if the following hold:*

*1.* $\mapsto_{\rho}$ *is substitutive:* $R \mapsto_{\rho} R'$ *implies* $R\{T/x\} \mapsto_{\rho} R'\{T/x\}$*;*
*2.* *(root swap)* $\to^{\neg\ell\ell} \cdot \mapsto_{\rho} \;\subseteq\; \mapsto_{\rho} \cdot \to^*_{\beta_!}$*.*

### **7 Case study: non-deterministic** *λ***-calculi**

To show how to use our framework, we apply the tools we have developed to our running example (see Examples 1 and 9). We extend the bang calculus with a *non-deterministic* binary operator $\oplus$, that is, we consider $(\Lambda^!_{\oplus}, \to_{\beta_!\oplus})$ where $\to_{\beta_!\oplus} \,=\, \to_{\beta_!} \cup \to_{\oplus}$, and $\to_{\oplus}$ is the contextual closure of the (non-deterministic) rules:

$$\oplus(T,S)\mapsto_{\oplus} T \qquad\qquad\qquad \oplus(T,S)\mapsto_{\oplus} S.$$
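A minimal Haskell rendering of this non-deterministic step (our own encoding; the one-step relation is represented as the list of all possible reducts, and only the $\oplus$-rules and their contextual closure are implemented):

```haskell
-- Terms of the bang calculus extended with a binary choice operator (+).
-- The one-step (+)-reduction is rendered as a function returning the list
-- of all possible reducts (non-determinism as a list). Names are ours.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  | Bang Term
  | Choice Term Term        -- (+)(T, S)
  deriving Show

-- Root rules: (+)(T,S) |-> T  and  (+)(T,S) |-> S.
rootChoice :: Term -> [Term]
rootChoice (Choice t s) = [t, s]
rootChoice _            = []

-- Contextual closure: fire a (+)-redex anywhere in the term.
stepChoice :: Term -> [Term]
stepChoice t = rootChoice t ++ inside t
  where
    inside (Lam x b)    = [ Lam x b'    | b' <- stepChoice b ]
    inside (App l r)    = [ App l' r    | l' <- stepChoice l ]
                       ++ [ App l r'    | r' <- stepChoice r ]
    inside (Bang b)     = [ Bang b'     | b' <- stepChoice b ]
    inside (Choice l r) = [ Choice l' r | l' <- stepChoice l ]
                       ++ [ Choice l r' | r' <- stepChoice r ]
    inside _            = []

main :: IO ()
main = mapM_ print (stepChoice (App (Choice (Var "x") (Var "y")) (Bang (Var "z"))))
-- ~> App (Var "x") (Bang (Var "z"))
--    App (Var "y") (Bang (Var "z"))
```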

*First step: non-deterministic bang calculus.* We analyze $\Lambda^!_{\oplus}$. We use our modular test to prove least-level factorization for $\Lambda^!_{\oplus}$: if $T \to^*_{\beta_!\oplus} U$ then $T \;(\to^{\ell\ell}_{\beta_!\oplus})^* \cdot (\to^{\neg\ell\ell}_{\beta_!\oplus})^*\; U$. By Lemma 4, an immediate consequence of the factorization result is that the least-level strategy is *complete*: if $U$ is normal, $T \to^*_{\beta_!\oplus} U$ implies $T \;(\to^{\ell\ell}_{\beta_!\oplus})^*\; U$.

*Second step: CbN and CbV non-deterministic calculi.* By translation, we obtain *for free* that the analogous results hold in $\Lambda^{\mathsf{cbn}}_{\oplus}$ and $\Lambda^{\mathsf{cbv}}_{\oplus}$, as defined in Example 9. So least-level factorization holds for both calculi, and moreover the least-level strategy is complete in both.


*What do we really need to prove?* The only result we need to prove is least-level factorization of $\to_{\beta_!\oplus}$. Completeness then follows by Lemma 4, and the translations automatically take care of transferring the results.

To prove factorization of $\to_{\beta_!\oplus}$, most of the work is done, since least-level factorization of $\to_{\beta_!}$ is already established; we then use our test (Proposition 34) to extend $\to_{\beta_!}$ with $\to_{\oplus}$. The only ingredients we need are substitutivity of $\mapsto_{\oplus}$ (which is an obvious property) and the following easy lemma.

**Lemma 35 (Roots).** *Let* $\rho \in \{\beta_!, \oplus\}$*. If* $T \to^{\neg\ell\ell}_{\rho} R \to_{\oplus} S$ *then* $T \to_{\oplus} \cdot \to^{=}_{\rho} S$*.*

**Theorem 36 (Least-level factorization in non-deterministic calculi).**


*Proof.* 1. It is enough to verify the hypotheses of Proposition 34, via Lemma 35. 2. It follows from Theorem 26 and Theorem 36.1.

Completeness is the best that can be achieved in these calculi, because of the true non-determinism of $\to_{\oplus}$, and hence of least-level reduction and of any other complete strategy for $\to$. For instance, in $\Lambda^{\mathsf{cbn}}_{\oplus}$ there is no normalizing strategy for $\oplus(x, \delta\delta)$ in the sense of Definition 2, since $x \;{\leftarrow^{\ell\ell}_{\oplus}}\; \oplus(x, \delta\delta) \to^{\ell\ell}_{\oplus} \delta\delta \to^{\ell\ell}_{\beta} \cdots$.

#### **8 Conclusions and Related Work**

Combining translations (Theorem 26), least-level factorization for $\to_{\beta_!}$ (Theorem 28), and modularity (Proposition 34) gives us a powerful method to analyze factorization in various λ-calculi that *extend* the pure CbN and CbV calculi. The main novelty is transferring the results from one calculus to another via translations.

*Related Work.* Many calculi inspired by linear logic subsume CbN and CbV, such as [5,6,29,24] (other than the ones already cited). We chose the bang calculus for its simplicity, which eases the analysis of the CbN and CbV translations.

To study CbN and CbV in a uniform way, an approach orthogonal to ours is given by Ronchi della Rocca and Paolini's parametric λ-calculus [28]. It is a *meta-calculus*, where the reduction rule is *parametric* with respect to a subset of terms (called values) with suitable properties. Different choices for the set of values define different calculi—that is, different reductions. This allows for a uniform presentation of proof arguments, such as the proof of standardization, which is actually a *meta-proof* that can be instantiated in both CbN and CbV.

Least-level reduction is studied for calculi based on linear logic in [34,1] and for linear logic proof-nets in [8,26]. It is studied for the pure CbN λ-calculus in [2].

*Acknowledgments.* The authors thank Beniamino Accattoli for insightful comments and discussions. This work was partially supported by EPSRC Project EP/R029121/1 *Typed Lambda-Calculi with Sharing and Unsharing*.

### **References**


Conference (RTA 2005). Lecture Notes in Computer Science, vol. 3467, pp. 219–234 (2005). https://doi.org/10.1007/978-3-540-32033-3 17


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Generalized Bounded Linear Logic and its Categorical Semantics**

Yōji Fukihara<sup>1</sup> and Shin-ya Katsumata<sup>2</sup>

<sup>1</sup> Kyoto University, Kyoto, Japan fukihara@kurims.kyoto-u.ac.jp
<sup>2</sup> National Institute of Informatics, Tokyo, Japan s-katsumata@nii.ac.jp

**Abstract.** We introduce a generalization of Girard et al.'s BLL called GBLL (and its affine variant GBAL). It is designed to capture the core mechanism of dependency in BLL, while it is also able to separate complexity aspects of BLL. The main feature of GBLL is to adopt a multiobject pseudo-semiring as a grading system of the !-modality. We analyze the complexity of cut-elimination in GBLL, and give a translation from BLL with constraints to GBAL with positivity axiom. We then introduce indexed linear exponential comonads (ILEC for short) as a categorical structure for interpreting the !-modality of GBLL. We give an elementary example of ILEC using folding product, and a technique to modify ILECs with symmetric monoidal comonads. We then consider a semantics of BLL using the folding product on the category of assemblies of a BCI-algebra, and relate the semantics with the realizability category studied by Hofmann, Scott and Dal Lago.

**Keywords:** Linear Logic · Categorical Semantics · Linear Exponential Comonad · Graded Comonad

### **1 Introduction**

Girard's *linear logic* is a refinement of propositional logic by restricting weakening and contraction in proofs [15]. Linear logic also has an *of-course modality* !, which restores these structural rules to formulas of the form !A.

Later, Girard et al. extended the !-modality with quantitative information so that usage of !-modal formulas in proofs can be quantitatively controlled [16]. This extension, called *bounded linear logic* (BLL for short), is successfully applied to a logical characterization of P-time computations.

Their extension takes two steps. First, the !-modality is extended to the form $!_r A$, where the index $r$ is an element of a semiring [16, Section 2.4]. The index $r$ is called a *grade* in modern terminology [11,13]. This extension and its variants have been employed in various logics and programming languages [7,30,14,26,28]. The categorical structure corresponding to $!_r A$ is identified as a *graded linear exponential comonad* [7,13,22].

Second, the $!_r$-modality is further extended to the form $!_{x<p} A$, where $p$ is a polynomial (called a *resource polynomial*) giving the upper bound of $x$ [16, Section 3]. The formula $!_{x<p} A$ also binds free occurrences of the resource variable


x in resource polynomials in A. Therefore, in BLL, both formulas and resource polynomials depend on the values stored in free resource variables. This dependency mechanism significantly increases the expressiveness of BLL, leading to a characterization of P-time complexity.

This characterization result was later revisited through a *realizability semantics* of BLL [16,19,10]. Inside this semantics, however, mechanisms for controlling the complexity of program execution are hard-coded, and it is not very clear which semantic structure realizes the dependency mechanism of BLL. This leads us to seek a logical and categorical understanding of BLL's dependency mechanism hidden underneath the complexity-related features, such as resource polynomials and computability constraints.

As a result of this quest, we propose a generalization of BLL called GBLL, and study its categorical semantics. The central idea of the generalization is to replace the grading semiring of the $!_r$-modality with a particular *multi-object pseudo-semiring* realized as a 2-category. Let us see how this replacement works. In GBLL, each formula is formed by deriving a judgment of the form $\Delta \vdash A$, where $\Delta$ is a set (called the *index set*) and $A$ is a raw formula. We may think of such a well-formed formula $\Delta \vdash A$ as denoting a $\Delta$-indexed family $\{\llbracket A\rrbracket_i\}_{i\in\Delta}$ of denotations. The formation rule for the !-modal formula in GBLL is the following:

$$\frac{\Delta' \vdash A \qquad f \in \mathbf{Set}(\Delta, (\Delta')^*)}{\Delta \vdash\; !_f A} \qquad\qquad ((\,)^*\colon \text{Kleene closure})$$

where the function $f$ abstractly represents dependency. This modality is enough to express the $!_{x<p}$-modality of BLL: we express the binding $x<p$ under a resource variable context $y$ as the *function* $f_p(y)=(y, 0)\cdots(y, p(y) - 1)$ that returns the list of environments extended with values less than $p(y)$. Then the denotation of the $!_f A$-modality is given by a variable-arity operator $D$. For each index $i \in \Delta$, the denotation is obtained by applying $D$ to the denotations of $A$ at the indices in the list $f(i)$:

$$\llbracket !_f A \rrbracket_i = D(\llbracket A \rrbracket_{j_1}, \cdots, \llbracket A \rrbracket_{j_n}) \quad \text{ where } j_1 \cdots j_n = f(i).$$

A simple example of a variable-arity modal operator is the *folding product* $D(X_1, \cdots, X_n) = X_1 \otimes \cdots \otimes X_n$.
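To make the two ingredients concrete, here is a small Haskell sketch (our own names; the semantic side is kept abstract): the bounding function $f_p$ and a folding-product-style reading of $!_f$.

```haskell
-- A 1-cell from d to d' is modelled as a function d -> [d'].
type Idx d d' = d -> [d']

-- The bounding function f_p of the introduction: it maps a resource
-- environment y to the list (y,0), (y,1), ..., (y, p(y)-1).
boundedBy :: (Int -> Int) -> Idx Int (Int, Int)
boundedBy p y = [ (y, i) | i <- [0 .. p y - 1] ]

-- Interpreting !_f over a Delta'-indexed family of denotations [[A]]_j,
-- with a variable-arity operator d (e.g. a folding product): at index i,
-- apply d to the denotations of A at the indices listed by f(i).
denotBang :: ([a] -> a) -> Idx d d' -> (d' -> a) -> (d -> a)
denotBang d f denotA i = d [ denotA j | j <- f i ]

main :: IO ()
main = do
  print (boundedBy (\y -> y * y) 3)      -- [(3,0),(3,1),...,(3,8)]
  -- folding "product" instantiated on lists, with [[A]]_j = [j]:
  print (denotBang concat (boundedBy (+ 2)) (\j -> [j]) 1)  -- [(1,0),(1,1),(1,2)]
```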

The pseudo-semiring structure on the class of functions of the form $\Delta \to (\Delta')^*$ is given as follows. For the multiplication $g \bullet f$, we adopt the *Kleisli composition* of the free monoid monad $(\,)^*$, while for the addition $f+g$, the pointwise concatenation $(f + g)(x) = f(x)g(x)$. However, these operations fail to satisfy one of the semiring axioms: $(f + g) \bullet h = f \bullet h + g \bullet h$. To fix this, we introduce (pointwise) list permutations as 2-cells between functions of type $\Delta \to (\Delta')^*$. These data form a 2-category **Idx**, which may be seen as a multi-object pseudo-semiring. Weakening, contraction, digging and dereliction in GBLL interact with these operations, much like the $!_r$-modality in [7].

We first study syntactic properties of GBLL. We introduce cut-elimination to GBLL and study its complexity property. It turns out that the proof technique used in BLL naturally extends to GBLL — as done in [16], we classify cuts into *reducible* and *irreducible* ones, introduce *proof weight*, and show that the reduction steps of reducible cuts will terminate in cubic time of proof weights. We also examine the expressive power of GBLL by giving a translation from an extension of BLL with *constraints* that are seen in Dal Lago et al.'s QBAL [10].

We next give a categorical semantics of GBLL. We introduce the concept of an *indexed linear exponential comonad* (*ILEC*); it is an **Idx**-graded linear exponential comonad satisfying a commutativity condition with respect to an underlying indexed SMCC. Then, we present a construction of an ILEC from a symmetric monoidal closed category C with a symmetric monoidal comonad on it. We apply this construction to the case where C is the category of assemblies over a BCI algebra [2,20], and relate the semantics of GBLL given by the constructed ILEC to the realizability category studied in [19,10].

*Acknowledgment* The first author was supported by JST ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603). The authors are grateful to anonymous reviewers for comments, and to Masahito Hasegawa, Naohiko Hoshino, Clovis Eberhart and Jérémy Dubut for fruitful discussions.

*Preliminaries* For a set Δ, by Δ<sup>∗</sup> we mean the set of finite sequences of Δ. The empty sequence is denoted by (). Juxtaposition of Δ∗-elements denotes the concatenation of sequences. For x ∈ Δ∗, by |x| we mean the length of x. We identify a natural number n and the set {0, ··· , n−1}; note that 0 = ∅. We also identify a sequence x ∈ Δ<sup>∗</sup> and the function "λi ∈ |x| . the i-th element of x".

### **2 Generalized Bounded Linear Logic**

### **2.1 Indexing 2-Category**

We first introduce a 2-category **Idx** (and its variant **Idx**a), which may be seen as a multi-object pseudo-semiring. It consists of the following data<sup>3</sup>: 0-cells are sets (called index sets), and the hom-category **Idx**(Δ, Δ ), which is actually a groupoid, is defined by:


The identity 1-cell and the composition of 1-cells in **Idx** are denoted by i<sup>Δ</sup> and (•), respectively. The composition is defined by (g•f)(x) def = g(y1)··· g(yn) where y<sup>1</sup> ··· y<sup>n</sup> = f(x). The hom-category **Idx**(Δ, Δ ) has a symmetric strict monoidal structure:

**–** the monoidal unit is the constant empty-sequence function 0(x) = (),

<sup>3</sup> This is a full sub-2-category of the Kleisli 2-category **CAT**<sup>S</sup> , where <sup>S</sup> is the 2-monad of symmetric strict monoidal category [21].

**–** the tensor product of f,g, denoted by f + g, is defined by the index-wise concatenation (f + g)(x) def = f(x)g(x).

We write J : **Set** → **Idx** for the inclusion, namely JΔ = Δ and (Jf)(x) = f(x) (the singleton sequence).

**Proposition 2.1.** *The composition* • *is symmetric strong monoidal in each argument. Especially, we have*

$$f \bullet 0 = 0 \quad 0 \bullet f = 0 \quad f \bullet (g+h) = f \bullet g + f \bullet h \quad (f+g) \bullet h \cong f \bullet h + g \bullet h.$$
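A small executable illustration of these 1-cell operations (our own encoding of 1-cells as list-valued functions), showing why one distributivity law holds on the nose while the other holds only up to a permutation 2-cell:

```haskell
import Data.List (sort)

-- 1-cells of Idx from d to d': functions d -> [d'] (finite sequences as lists).
type Cell d d' = d -> [d']

-- Kleisli composition of the list (free monoid) monad: (g . f)(x) = g(y1)...g(yn).
comp :: Cell d' d'' -> Cell d d' -> Cell d d''
comp g f = concatMap g . f

-- Monoidal unit (constant empty sequence) and pointwise concatenation.
zero :: Cell d d'
zero _ = []

add :: Cell d d' -> Cell d d' -> Cell d d'
add f g x = f x ++ g x

-- Concrete 1-cells on Int for testing.
f, g, h :: Cell Int Int
f x = [x, x + 10]
g x = [x * 2]
h x = [x, x + 1]

main :: IO ()
main = do
  print (comp f zero 1 :: [Int], comp zero f 1 :: [Int])              -- ([],[]): f.0 = 0 = 0.f
  -- f . (g + h) = f . g + f . h holds on the nose:
  print (comp f (add g h) 1 == add (comp f g) (comp f h) 1)           -- True
  -- (f + g) . h = f . h + g . h holds only up to reordering (a 2-cell of Idx):
  print (comp (add f g) h 1)                                          -- [1,11,2,2,12,4]
  print (add (comp f h) (comp g h) 1)                                 -- [1,11,2,12,2,4]
  print (sort (comp (add f g) h 1) == sort (add (comp f h) (comp g h) 1))  -- True
```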

We also define **Idx**<sup>a</sup> by replacing "bijection" in the definition of 2-cell of **Idx** with "injection". The hom-category **Idx**a(Δ, Δ ) has the 1-cell 0 as the terminal object, hence is a symmetric *affine* monoidal category.

#### **2.2 Formulas and Proofs**

**Definition of GBLL Formulas** We first fix a **Set**-indexed family $\{\mathcal{A}(\Delta)\}_{\Delta\in\mathbf{Set}}$ of sets of atomic propositions. Formulas are defined by the following BNF:

$$A ::= a \star r \mid A \otimes A \mid A \multimap A \mid !\_f A$$

where $a \in \mathcal{A}(\Delta)$ for some set $\Delta$, $r$ is a function (called a *reindexing function*) and $f$ is a 1-cell in **Idx**. Formula formation rules are introduced to derive judgments $\Delta \vdash A$ pairing an index set $\Delta$ with a formula $A$. They are defined as follows:

$$\frac{a \in \mathcal{A}(\Delta') \quad r \in \mathbf{Set}(\Delta, \Delta')}{\Delta \vdash a \star r} \qquad
\frac{\Delta \vdash A \quad \Delta \vdash B}{\Delta \vdash A \otimes B} \qquad
\frac{\Delta \vdash A \quad \Delta \vdash B}{\Delta \vdash A \multimap B} \qquad
\frac{\Delta' \vdash A \quad f \in \mathbf{Idx}(\Delta, \Delta')}{\Delta \vdash\; !_f A}$$

The formula $a \star r$ represents the atomic formula $a$ precomposed with a reindexing function $r$. We write $\mathbf{Fml}(\Delta) = \{A \mid \Delta \vdash A\}$.

We next introduce the reindexing operation on formulas.

**Definition 2.1.** *For a reindexing function* $r \in \mathbf{Set}(\Delta, \Delta')$*, we define the* reindexing operator $(\,)|_r : \mathbf{Fml}(\Delta') \to \mathbf{Fml}(\Delta)$ *along* $r$ *by*

$$a \star r|_{r'} \stackrel{\text{def}}{=} a \star (r \circ r'), \qquad\qquad (A \otimes B)|_{r} \stackrel{\text{def}}{=} A|_{r} \otimes B|_{r},$$

$$(A \multimap B)|_{r} \stackrel{\text{def}}{=} A|_{r} \multimap B|_{r}, \qquad\qquad (!_{f}A)|_{r} \stackrel{\text{def}}{=}\; !_{f \bullet Jr}\, A.$$

*We routinely extend reindexing operators to sequences of formulas well-formed under a common index set.*

We quotient the set of well-formed formulas by the least congruent equivalence relation generated from the following binary relation:

$$\{ (!\_{Jr\bullet f} A, !\_{f} (A|\_{r})) \mid r \in \mathbf{Set}(\Delta', \Delta''), f \in \mathbf{Idx}(\Delta, \Delta'), \Delta'' \vdash A \} \tag{2.1}$$

We see some formations of formulas in GBLL.

*Example 2.1.* Let us illustrate how a formula $!_{y<x^2}\, !_{z<x+y}\, A$ of BLL is represented in GBLL; here we assume that $x, y, z$ are the only resource variables used in this formula. We first introduce a notation. Let $E$ be a mathematical expression using variables $\mathbf{x}_1 \cdots \mathbf{x}_n$. Then by $[E]_n : \mathbb{N}^n \to (\mathbb{N}^{n+1})^*$ we mean the function

$$[E]_n(\vec{x}) = (\vec{x},0)(\vec{x},1)\cdots(\vec{x},\, E[x_1/\mathbf{x}_1,\cdots,x_n/\mathbf{x}_n] - 1) \qquad (\vec{x} \stackrel{\text{def}}{=} (x_1,\cdots,x_n) \in \mathbb{N}^n)$$

For instance, $[\mathbf{x}_1^2]_1(x) = (x, 0), \cdots, (x, x^2 - 1)$. Then from a well-formed formula $\mathbb{N}^3 \vdash A$, we obtain $\mathbb{N} \vdash\; !_{[\mathbf{x}_1^2]_1}\, !_{[\mathbf{x}_1+\mathbf{x}_2]_2}\, A$. Generalizing this, a BLL formula $!_{x<E}\, A$ containing resource variables $\mathbf{x}_1, \cdots, \mathbf{x}_n$ corresponds to the GBLL formula $!_{[E]_n}\, A$.

*Example 2.2.* We look at how we express the substitution of a resource polynomial <sup>A</sup>[<sup>x</sup> := <sup>p</sup>(x1, ..., xn)]. We define a function <sup>p</sup> <sup>n</sup> : <sup>N</sup><sup>n</sup> <sup>→</sup> <sup>N</sup><sup>n</sup>+1 by

$$
\langle p \rangle\_n(x\_1, \ldots, x\_n) \stackrel{\text{def}}{=} (x\_1, \ldots, x\_n, p(x\_1, \ldots, x\_n))\ .
$$

Then the reindexed formula <sup>N</sup><sup>n</sup> <sup>A</sup>|<sup>p</sup> <sup>n</sup> corresponds to <sup>A</sup>[<sup>x</sup> := <sup>p</sup>(x1, ··· , xn)].

*Example 2.3.* We illustrate the equality between well-formed formulas. Consider a formula <sup>N</sup> <sup>A</sup> and a function <sup>r</sup> <sup>∈</sup> **Set**(N<sup>3</sup>, <sup>N</sup>). Then we equate formulas <sup>N</sup><sup>2</sup> ![**x**1+**x**2]<sup>2</sup> (A|<sup>r</sup>) and <sup>N</sup><sup>2</sup> !hA, where <sup>h</sup> <sup>∈</sup> **Idx**(N<sup>2</sup>, <sup>N</sup>) is given by

$$h \stackrel{\text{def}}{=} Jr \bullet [\mathbf{x}_1 + \mathbf{x}_2]_2, \qquad\quad h(x, y) = r(x, y, 0) \cdots r(x, y, x + y - 1).$$

**Definition of GBLL Proofs** A *judgment* of GBLL is of the form $\Delta \mid \Gamma \vdash A$, where $\Delta$ is an index set, $\Gamma$ is a sequence of formulas well-formed under $\Delta$, and $A$ is a well-formed formula under $\Delta$. The inference rules of GBLL are presented in Fig. 1. Similarly, we define GBAL to be the system obtained by replacing **Idx** in Fig. 1 with **Idx**a.

*Example 2.4.* We mimic a special case of the contraction rule in BLL

$$\frac{\Gamma,\ !_{x<x_i} A,\ !_{y<x_j} A\{x_i + y/x\} \vdash B}{\Gamma,\ !_{x<x_i+x_j} A \vdash B}$$

See also the (!C)-rule of CBLL in Section 3.2. We use the *shift function* $s_{n,i} \in \mathbf{Set}(\mathbb{N}^{n+1}, \mathbb{N}^{n+1})$ defined by $s_{n,i}(x_1, \cdots, x_n, y) \stackrel{\text{def}}{=} (x_1, \cdots, x_n, x_i + y)$. Then we easily see $[\mathbf{x}_i]_n + Js_{n,i} \bullet [\mathbf{x}_j]_n = [\mathbf{x}_i + \mathbf{x}_j]_n$. By the contraction rule of GBLL, we obtain the following derivation for well-formed formulas $\mathbb{N}^{n+1} \vdash A$ and $\mathbb{N}^n \vdash B$, mimicking the contraction of BLL:

$$\frac{!_{[\mathbf{x}_i]_n} A,\ !_{[\mathbf{x}_j]_n} (A|_{s_{n,i}}) \vdash B}{!_{[\mathbf{x}_i+\mathbf{x}_j]_n} A \;=\; !_{[\mathbf{x}_i]_n+J s_{n,i}\bullet [\mathbf{x}_j]_n} A \vdash B}$$

Here, we use the formula equality $!_{Js_{n,i}\bullet[\mathbf{x}_j]_n} A = \;!_{[\mathbf{x}_j]_n} (A|_{s_{n,i}})$.
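As a quick executable check of the identity $[\mathbf{x}_i]_n + Js_{n,i} \bullet [\mathbf{x}_j]_n = [\mathbf{x}_i + \mathbf{x}_j]_n$ used above, here is a Haskell sketch (our own encoding, for $n = 2$, $i = 1$, $j = 2$; environments as pairs, sequences as lists):

```haskell
-- Check of [x_i]_n + J s_{n,i} . [x_j]_n = [x_i + x_j]_n for n = 2, i = 1, j = 2.
-- Environments over (x1,x2) are pairs; the extended environment adds a third slot.
type Env2 = (Int, Int)
type Env3 = (Int, Int, Int)

-- [E]_2 for an expression E over (x1,x2).
bracket :: (Env2 -> Int) -> Env2 -> [Env3]
bracket e (x1, x2) = [ (x1, x2, k) | k <- [0 .. e (x1, x2) - 1] ]

-- Shift function s_{2,1}(x1,x2,y) = (x1,x2,x1+y); J lifts it to sequences via map.
shift :: Env3 -> Env3
shift (x1, x2, y) = (x1, x2, x1 + y)

main :: IO ()
main = do
  let env = (2, 3)
      lhs = bracket fst env ++ map shift (bracket snd env)   -- [x1]_2 + J s_{2,1} . [x2]_2
      rhs = bracket (\(a, b) -> a + b) env                    -- [x1 + x2]_2
  print lhs            -- [(2,3,0),(2,3,1),(2,3,2),(2,3,3),(2,3,4)]
  print (lhs == rhs)   -- True
```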

$$\frac{\Delta \vdash A}{\Delta \mid A \vdash A}\,(\mathrm{Ax}) \qquad \frac{\Delta \mid \Gamma, X, Y, \Gamma' \vdash A}{\Delta \mid \Gamma, Y, X, \Gamma' \vdash A}\,(\mathrm{Exch}) \qquad \frac{\Delta \mid \Gamma_1 \vdash A \quad \Delta \mid \Gamma_2, A \vdash B}{\Delta \mid \Gamma_1, \Gamma_2 \vdash B}\,(\mathrm{Cut})$$

$$\frac{\Delta \mid \Gamma, X, Y \vdash A}{\Delta \mid \Gamma, X \otimes Y \vdash A}\,(\otimes\mathrm{L}) \qquad \frac{\Delta \mid \Gamma_1 \vdash X \quad \Delta \mid \Gamma_2 \vdash Y}{\Delta \mid \Gamma_1, \Gamma_2 \vdash X \otimes Y}\,(\otimes\mathrm{R}) \qquad \frac{\Delta \mid \Gamma_1 \vdash X \quad \Delta \mid \Gamma_2, Y \vdash B}{\Delta \mid \Gamma_1, \Gamma_2, X \multimap Y \vdash B}\,(\multimap\mathrm{L}) \qquad \frac{\Delta \mid \Gamma, X \vdash Y}{\Delta \mid \Gamma \vdash X \multimap Y}\,(\multimap\mathrm{R})$$

$$\frac{\Delta \mid \Gamma \vdash B}{\Delta \mid \Gamma, !_0 A \vdash B}\,(!\mathrm{W}) \qquad \frac{\Delta \mid \Gamma, A \vdash B}{\Delta \mid \Gamma, !_{\mathrm{i}_\Delta} A \vdash B}\,(!\mathrm{D}) \qquad \frac{\Delta \mid \Gamma, !_g A \vdash B \quad \sigma \in \mathbf{Idx}(\Delta, \Delta')(f,g)}{\Delta \mid \Gamma, !_f A \vdash B}\,(!\mathrm{F}) \qquad \frac{\Delta \mid \Gamma, !_{f_1} A, !_{f_2} A \vdash B}{\Delta \mid \Gamma, !_{f_1+f_2} A \vdash B}\,(!\mathrm{C})$$

$$\frac{\Delta' \mid !_{g_1} A_1, \cdots, !_{g_k} A_k \vdash B \quad f \in \mathbf{Idx}(\Delta, \Delta')}{\Delta \mid !_{g_1\bullet f} A_1, \cdots, !_{g_k\bullet f} A_k \vdash\; !_f B}\,(\mathrm{P}!)$$

**Fig. 1.** GBLL Proof Rules

*Example 2.5.* The reindexing operator can be extended to proofs. Let r be a reindexing function in **Set**(Δ, Δ ). Reindexing of the axiom rule Δ | A A, by r is the axiom rule Δ | A|<sup>r</sup> A|<sup>r</sup>. Reindexing of other rules except (P!) can be easily defined—the judgment Δ | Γ A in each rule is replaced with Δ | Γ|<sup>r</sup> A|<sup>r</sup> by reindexing. For (P!) rule, reindexing by r is given as follows:

$$\frac{\Delta^{\prime\prime} \mid !\_{g\_1} A\_1, \dots \cdot \text{,} !\_{g\_k} A\_k \vdash B \qquad f \bullet J r \in \mathbf{Idx}(\Delta, \Delta^{\prime\prime})}{\Delta \mid (!\_{g\_1 \bullet f} A\_1)|\_r, \dots \cdot \text{,} (!\_{g\_k \bullet f} A\_k)|\_r \vdash (!\_f B)|\_r}$$

*Remark 2.1.* In this paper, the indexing 2-category is either **Idx** or **Idx**a. Allowing more general indexing 2-categories in GBLL is future work. In his PhD thesis, Breuvart designed a linear logic similar to GBLL upon an abstract indexing mechanism called *dependent semirings* [5, Definition 3.2.4.5]. It consists of pairs of categories (S, U) such that 1) each hom-set in S carries a (not necessarily commutative) ordered monoid structure (0, +) and the composition of S distributes over 0, +, and 2) U acts on S from both sides. Roughly speaking, S and U correspond to our **Idx**op and **Set**op, respectively. We expect that a unification of dependent semirings and the 2-categories **Idx**, **Idx**a would yield a suitable generalization of indexing categories for GBLL. This generalization would subsume non-graded linear logic, and allow us to compare GBLLs over different indexing categories.

#### **2.3 Complexity of Cut-Elimination in GBLL**

By a discussion similar to BLL [16], instances of the Cut inference are divided into two classes: *reducible* cuts and *irreducible* cuts. We define the *weight* $|\pi|$ of each proof $\pi \rhd \Delta \mid \Gamma \vdash A$ and *reduction steps* of proofs, such that every sequence of reductions of reducible cuts terminates, for each index $\delta \in \Delta$, in a number of steps polynomial in $|\pi|(\delta)$.

**Definition 2.2.** *[16, Appendix A] In* GBLL *(resp.* GBAL*) proofs, an instance of the Cut inference is* irreducible *if there is at least one Composition rule below it, or if its left premise is obtained by a Composition rule with nonempty context and the other premise is obtained by a Weakening, !-Functor, Dereliction, Contraction or Composition inference. A* reducible *cut is a Cut inference that is not irreducible.*

The definitions of (ir)reducibility and of the weight are adapted from Girard's paper. Therefore, our system inherits from BLL the conditions under which cuts can be reduced. See also Section 2.4 in [16].

**Definition 2.3.** *A* GBLL *or* GBAL *proof is* irreducible *if it contains only irreducible cut inferences.*

Following [16], we introduce the concept of the *weight* of a proof. It is a function $|\pi| : \Delta \to \mathbb{N}$ assigning a weight $|\pi|(\delta)$ to a proof $\pi$ at each index $\delta \in \Delta$. The weight never increases at any reduction step of a Cut in $\pi$. In the original BLL, weights are expressed by resource polynomials, while here they are generalized to arbitrary functions. We remark that the weights of proofs involving Composition rules, which introduce the $!_f$ modality, use the length of the lists constructed by $f$.

**Definition 2.4.** *For a given proof* $\pi \rhd \Delta \mid \Gamma \vdash A$ *of* GBLL *or* GBAL*, the* weight *of* $\pi$ *is a function* $|\pi| : \Delta \to \mathbb{N}$ *inductively defined as follows. A) When* $\Delta = \emptyset$*,* $|\pi|$ *is the evident function. B) When* $\Delta \neq \emptyset$*,* $|\pi|$ *is defined by the following rules:*


$$\pi \rhd \frac{\Delta' \mid !\_{\alpha\_1} A\_1, \dots, !\_{\alpha\_k} A\_k \vdash B}{\Delta \mid !\_{\alpha\_1 \bullet f} A\_1, \dots, !\_{\alpha\_k \bullet f} A\_k \vdash !\_f B}$$

*then* $|\pi|(\delta) \stackrel{\text{def}}{=} \sum_{\gamma\in f(\delta)} \big(|\pi'|(\gamma)+2k + 1\big) + k + 1$*. Note that the summation* $\sum_{\gamma\in f(\delta)}$ *scans all elements of the list* $f(\delta)$*, hence the weight depends on the length of* $f(\delta)$*.*

**Theorem 2.1.** *For every proof* $\pi \rhd \Delta \mid \Gamma \vdash A$ *and every* $\delta \in \Delta$*, reduction of reducible cuts terminates in at most* $(|\pi|(\delta))^3$ *steps.*

*Proof (sketch).* The proof is almost the same as Section 2.2 and Appendix A of [16], except for the definition of the weight. Suppose that $\pi$ one-step reduces to $\pi'$. From the definition of the weight, either 1) for every index $\delta \in \Delta$ the weight decreases (that is, $|\pi|(\delta) > |\pi'|(\delta)$), or 2) for every index $\delta \in \Delta$ the weight is preserved (that is, $|\pi|(\delta) = |\pi'|(\delta)$). A reduction of the former type is called a symmetric or axiom reduction [16, Sections 2.2.1 and 2.2.2], while the latter is called a commutative reduction [16, Section 2.2.3].

In the case where the weight is preserved, we introduce another measure, the *cut size* $\|\pi\| : \Delta \to \mathbb{N}$ of a proof $\pi$. Its definition is the same as that of the weight except for the Cut rule: for a proof $\pi$ obtained by the Cut rule from $\pi_1$ and $\pi_2$, the cut size $\|\pi\|(\delta)$ is defined to be $\|\pi_1\|(\delta) + \|\pi_2\|(\delta) + |\pi_1|(\delta) + |\pi_2|(\delta)$.

In each commutative reduction from $\pi$ to $\pi'$ the cut size decreases at every index (that is, for all $\delta \in \Delta$, $\|\pi\|(\delta) > \|\pi'\|(\delta)$), and the cut size is at most the square of the weight (that is, for all $\delta \in \Delta$, $\|\pi\|(\delta) \leq (|\pi|(\delta))^2$). Therefore, the total number of steps is at most the cube of the weight.

The number of reduction steps of a proof π and its weight depend on the length of lists computed by the **Idx**-morphisms occurring in π. However, to discuss the actual time complexity of cut-elimination, we further need to take into account the time complexity of the computation of **Idx**-morphisms. This would be achieved by looking at a subcategory of **Idx** computable within a certain time complexity. We leave this argument of analyzing the actual time complexity of cut-elimination as a future work.

#### **3 Translation from Constrained BLL**

We show that GBLL can express BLL via a translation. This translation is actually given between variants of these calculi, namely from BLL *with constraints* (called CBLL) to GBAL *with positivity axioms* (called GBAL<sup>+</sup>).

CBLL is an extension of BLL with *constraints*, which are one of the features of Dal Lago and Hofmann's QBAL [10]. Constraints explicitly specify conditions imposed on resource variables, and it is natural to explicitly maintain these conditions throughout proofs. We also remark that in CBLL, weakening of !-formulas $!_{x<p+q}A \vdash\; !_{x<p}A$ is allowed, and atomic formulas are assumed to satisfy the positivity property (3.1).

GBAL<sup>+</sup> is designed for a sound translation from CBLL. Recall that GBAL is an extension of GBLL with weakening $!_{f+g}A \vdash\; !_{f}A$ on !-formulas. Then GBAL<sup>+</sup> is a further extension of GBAL with the following positivity axioms for atomic formulas: for every $n$-ary atomic formula $a \in \mathcal{A}$ in CBLL, we introduce an atomic formula $[a] \in \mathcal{A}(\mathbb{N}^n)$ to GBAL, together with the axiom:

$$V_{\mathcal{C}}(F) \mid \emptyset \vdash [a] \star \langle p_1, \dots, p_n \rangle \multimap [a] \star \langle q_1, \dots, q_n \rangle \qquad (\forall i.\ p_i \sqsubseteq_{\mathcal{C}} q_i).$$

Here the definition of each notation is given in Section 3.1 and 3.3. Positivity axiom induces proofs V<sup>C</sup> (F) | A A for every two formulas A, A such that A <sup>C</sup> A (the relation <sup>C</sup> for formulas is defined in Section 3.2).

#### **3.1 Resource Polynomials and Constraints**

We introduce basic concepts around CBLL, referring to its super-logic QBAL [10]. We put a reference in the beginning of each paragraph when the contents come from QBAL in [10].

[10, Definition 2.1] Given a countably infinite set RV of *resource variables*, a *resource monomial* over RV is a finite product of binomial coefficients $\prod_{i=1}^{m} \binom{x_i}{n_i}$, where the resource variables $x_1, \cdots, x_m$ are distinct and $n_1, \cdots, n_m \in \mathbb{N}$ are natural numbers. A *resource polynomial* over RV is a finite sum of resource monomials. We write $\binom{x}{0}$ as $1$ and $\binom{x}{1}$ as $x$ for short. Each positive natural number $n$ denotes a resource polynomial $1 + 1 + \cdots + 1$. Resource polynomials are closed under sum, product, bounded sum and composition [10, Lemma 2.2].
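A tiny executable reading of these definitions (our own encoding; monomials as lists of variable–exponent pairs, missing variables defaulting to 0):

```haskell
-- Resource monomials are finite products of binomial coefficients C(x_i, n_i);
-- resource polynomials are finite sums of monomials. Names are ours.
type Env      = [(String, Integer)]
type Monomial = [(String, Integer)]   -- [(x, n)] stands for prod_i C(x_i, n_i)
type Poly     = [Monomial]            -- sum of monomials

-- Binomial coefficient C(x, n) = x (x-1) ... (x-n+1) / n!.
choose :: Integer -> Integer -> Integer
choose x n = product [x - k | k <- [0 .. n - 1]] `div` product [1 .. n]

evalMono :: Env -> Monomial -> Integer
evalMono env m = product [ choose (lookupVar x) n | (x, n) <- m ]
  where lookupVar x = maybe 0 id (lookup x env)

evalPoly :: Env -> Poly -> Integer
evalPoly env = sum . map (evalMono env)

main :: IO ()
main =
  -- p = C(x,2) + y evaluated at x = 4, y = 3:  6 + 3 = 9
  print (evalPoly [("x", 4), ("y", 3)] [[("x", 2)], [("y", 1)]])
```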

[10, Definition 2.3] A *constraint* is an inequality $p \leq q$, where $p$ and $q$ are resource polynomials. We abbreviate $p+1 \leq q$ as $p<q$. A constraint $p \leq q$ *holds* (written $\models p \leq q$) if it is true in the standard model. A *constraint set* (denoted by $\mathcal{C}$, $\mathcal{D}$) is a finite set of constraints. A constraint $p \leq q$ *is a consequence* of a constraint set $\mathcal{C}$ (written $\mathcal{C} \models p \leq q$) if $p \leq q$ is a logical consequence of $\mathcal{C}$. For constraint sets $\mathcal{C}$ and $\mathcal{D}$, we write $\mathcal{C} \models \mathcal{D}$ iff $\mathcal{C} \models p \leq q$ for every constraint $p \leq q$ in $\mathcal{D}$. For each constraint set $\mathcal{C}$, we define an order $\sqsubseteq_{\mathcal{C}}$ on resource polynomials by $p \sqsubseteq_{\mathcal{C}} q$ iff $\mathcal{C} \models p \leq q$.

[10, Definition 2.3] We define the polarity of occurrences of free resource variables. For a constraint p ≤ q, we say that an occurrence of a resource variable x in p is called *negative*, while the one in q is called *positive*.

#### **3.2 Formulas and Inference Rules of CBLL**

Let A be a set of atomic formulas and assume that each atomic formula a ∈ A is associated with an arity ar(a). Formulas of CBLL are defined by:

$$A, B ::= \quad a(p\_1, \dots, p\_{\text{ar}(a)}) \mid A \otimes B \mid A \multimap B \mid !\_{x < p} A$$

where $p$ in the formula $!_{x<p}A$ satisfies $x \notin \mathrm{FV}(p)$.

[10, Definition 2.6] Each occurrence of a free resource variable in a formula is classified into *positive* or *negative*. Below we inductively define a *positive occurrence* of a resource variable. An occurrence of x in:

**–** a(p1, ··· , par(a)) is always positive.

**–** A ⊗ B is positive iff it is in A and positive, or so in B.


*(Figure 2: the CBLL inference rules (Ax), (Str), (!W), (!D), (!C), (!P) and (!N), following QBAL [10].)*

**Fig. 2.** Inference Rules for CBLL (the rules for ⊗ and ⊸ are omitted)


[10, Definition 2.8] We extend the order <sup>C</sup> on resource polynomials to the one on CBLL formulas.

$$\begin{aligned}
&a(p_1, \ldots, p_{\mathrm{ar}(a)}) \sqsubseteq_{\mathcal{C}} a(q_1, \ldots, q_{\mathrm{ar}(a)}) \text{ iff } \forall i.\ p_i \sqsubseteq_{\mathcal{C}} q_i \\
&A \otimes B \sqsubseteq_{\mathcal{C}} C \otimes D \text{ iff } (A \sqsubseteq_{\mathcal{C}} C) \wedge (B \sqsubseteq_{\mathcal{C}} D) \\
&A \multimap B \sqsubseteq_{\mathcal{C}} C \multimap D \text{ iff } (C \sqsubseteq_{\mathcal{C}} A) \wedge (B \sqsubseteq_{\mathcal{C}} D) \\
&!_{x<p} A \sqsubseteq_{\mathcal{C}}\; !_{x<q} B \text{ iff } (q \sqsubseteq_{\mathcal{C}} p) \wedge (x \notin \mathrm{FV}(\mathcal{C})) \wedge (A \sqsubseteq_{\mathcal{C}} B)
\end{aligned} \tag{3.1}$$

[10, Section 2.3] A CBLL *judgment* is an expression $\Gamma \vdash_{\mathcal{C}} A$, where $\mathcal{C}$ is a constraint set, $\Gamma$ is a multiset of formulas and $A$ is a formula. A judgment $\Gamma \vdash_{\mathcal{C}} A$ means that $A$ is a consequence of $\Gamma$ under the constraints $\mathcal{C}$.

*Inference rules* (Fig. 2) are almost the same as those of QBAL; we omit the rules for $\otimes$, $\multimap$ and Cut. Note that weakening is restricted to !-formulas. Every BLL proof of $\Gamma \vdash A$ can be translated to a CBLL proof of $\Gamma \vdash_{\emptyset} A$.

### **3.3 Translation into GBAL<sup>+</sup>**

As mentioned at the beginning of Section 3, we will give a translation from CBLL to GBAL<sup>+</sup>. When translating a CBLL proof of $\Gamma \vdash_{\mathcal{C}} A$, we also need to supply a set $F$ of free resource variables satisfying $F \supseteq \mathrm{FV}(\Gamma) \cup \mathrm{FV}(A) \cup \mathrm{FV}(\mathcal{C})$. Then the translation of the proof of $\Gamma \vdash_{\mathcal{C}} A$ yields a proof of $V_{\mathcal{C}}(F) \mid [\Gamma]^{(F;\mathcal{C})} \vdash [A]^{(F;\mathcal{C})}$ in GBAL<sup>+</sup>.

**For Constraints** We define an *environment* over a finite set $F$ of resource variables to be a function from $F$ to $\mathbb{N}$; by $V(F)$ we mean the set of environments over $F$. Given an environment $\rho \in V(F)$, a resource variable $x$ and $n \in \mathbb{N}$, by $\rho\{x \mapsto n\}$ we mean the environment over $F \cup \{x\}$ that extends $\rho$ with the mapping $x \mapsto n$. Given a resource polynomial $p$ such that $\mathrm{FV}(p) \subseteq F$, by $\llbracket p\rrbracket : V(F) \to \mathbb{N}$ we mean the function that evaluates $p$ under a given environment. For resource polynomials $p_1, \cdots, p_n$ such that $\mathrm{FV}(p_i) \subseteq F$, we define a function $\langle p_1, \cdots, p_n\rangle : V(F) \to \mathbb{N}^n$ by $\langle p_1, \cdots, p_n\rangle\rho = (\llbracket p_1\rrbracket\rho, \cdots, \llbracket p_n\rrbracket\rho)$.

Let $\rho \models p \leq q$ denote $\llbracket p\rrbracket\rho \leq \llbracket q\rrbracket\rho$, for a constraint $p \leq q$ with a set $F$ of free resource variables (such that $\mathrm{FV}(p) \cup \mathrm{FV}(q) \subseteq F$) and for an environment $\rho \in V(F)$. For a subset $S \subseteq V(F)$ and a constraint set $\mathcal{C}$, $S \models \mathcal{C}$ is defined similarly: for every $\rho \in S$ and every $p \leq q \in \mathcal{C}$, $\rho \models p \leq q$. Given a constraint set $\mathcal{C}$ and a set $F$ of resource variables such that $\mathrm{FV}(\mathcal{C}) \subseteq F$, let the set $V_{\mathcal{C}}(F)$ and the function $\iota_{F,\mathcal{C}} : V_{\mathcal{C}}(F) \to V(F)$ be given by:

$$V_{\mathcal{C}}(F) \stackrel{\text{def}}{=} \{\rho \in V(F) \mid \rho \models \mathcal{C}\}, \qquad\qquad \iota_{F,\mathcal{C}}(\rho) \stackrel{\text{def}}{=} \rho.$$

For a resource polynomial p, a free resource variable x such that x ∉ FV(p), a constraint set C and a set F of resource variables such that FV(p) ∪ FV(C) ⊆ F, we introduce a map $[x<p]_{(F;\mathcal{C})} : V_{\mathcal{C}}(F) \to (V_{\mathcal{C}\cup\{x<p\}}(F \cup \{x\}))^{*}$ by

$$[x < p]_{(F;\mathcal{C})}\,\rho \stackrel{\text{def}}{=} \rho\{x \mapsto 0\},\ \rho\{x \mapsto 1\},\ \ldots,\ \rho\{x \mapsto (\llbracket p\rrbracket\rho - 1)\}$$
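As a small illustration of these definitions (a concrete instance of our own choosing), take F = {y}, C = ∅ and p = y + 2. For the environment ρ = {y ↦ 1} we have $\llbracket y+2\rrbracket\rho = 3$, and hence

$$[x < y+2]_{(\{y\};\emptyset)}\,\rho \;=\; \rho\{x \mapsto 0\},\ \rho\{x \mapsto 1\},\ \rho\{x \mapsto 2\},$$

a list of three environments over {y, x}, each satisfying the constraint x < y + 2.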

**For Formulas** Given a CBLL formula A, a constraint set C and a set F of resource variables such that F ⊇ FV(A) ∪ FV(C), the translation $[A]^{(F;\mathcal{C})}$, a well-formed formula over the index set $V_{\mathcal{C}}(F)$, is defined inductively as follows:

$$\begin{aligned}
[a(p_1, \ldots, p_n)]^{(F;\mathcal{C})} &\stackrel{\text{def}}{=} [a] \star (\langle p_1, \ldots, p_n \rangle \circ \iota_{F,\mathcal{C}})\\
[A \otimes B]^{(F;\mathcal{C})} &\stackrel{\text{def}}{=} [A]^{(F;\mathcal{C})} \otimes [B]^{(F;\mathcal{C})}\\
[A \multimap B]^{(F;\mathcal{C})} &\stackrel{\text{def}}{=} [A]^{(F;\mathcal{C})} \multimap [B]^{(F;\mathcal{C})}\\
[!_{x<p} A]^{(F;\mathcal{C})} &\stackrel{\text{def}}{=}\ !_{[x<p]_{(F;\mathcal{C})}} [A]^{(F\cup\{x\};\,\mathcal{C}\cup\{x<p\})}
\end{aligned}$$
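To see how the clauses combine, unfolding them on the small instance $!_{x<y}\,a(x)$ (with F = {y} and C = ∅, a choice made purely for illustration) gives

$$[!_{x<y}\, a(x)]^{(\{y\};\emptyset)} \;=\; !_{[x<y]_{(\{y\};\emptyset)}}\bigl([a] \star (\langle x\rangle \circ \iota_{\{y,x\},\{x<y\}})\bigr),$$

so the bound of the CBLL modality becomes the environment-indexed family $[x<y]_{(\{y\};\emptyset)}$, and the resource argument of the atom is evaluated pointwise through ι.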

**For Proofs** To give a translation of proofs, we need one more piece of notation. For resource polynomials p, q, a set F of resource variables and a constraint set C such that FV(p) ∪ FV(C) ⊆ F, a set $[p,q)^{(F;\mathcal{C})}$ of environments is defined by

$$[p,q)^{(F;\mathcal{C})} \stackrel{\text{def}}{=} \{\rho \in V(F \cup \{t\}) \mid \rho \models \mathcal{C},\ \llbracket p\rrbracket\rho \leq \rho(t) < \llbracket p+q\rrbracket\rho\}$$

here t is a "fresh" resource variable such that t /∈ F.
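For instance, taking F = {y}, C = ∅, p = y and q = 2 (again an instance of our own choosing), an environment ρ ∈ V({y, t}) lies in $[y, 2)^{(\{y\};\emptyset)}$ exactly when ρ(y) ≤ ρ(t) < ρ(y) + 2; both {y ↦ 3, t ↦ 3} and {y ↦ 3, t ↦ 4} are examples.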

Given a proof π of $\Gamma \vdash_{\mathcal{C}} A$, a translated proof $[\pi]^{(F;\mathcal{C})}$ of $V_{\mathcal{C}}(F) \mid [\Gamma]^{(F;\mathcal{C})} \vdash [A]^{(F;\mathcal{C})}$ is defined by induction on the structure of the proof:


$$\begin{array}{ll}
s_{p,q}^{(F;\mathcal{C})}: & V_{\mathcal{C}}(F) \to [p,q)^{(F;\mathcal{C})}\\
& \rho \mapsto \rho\{t \mapsto \llbracket p\rrbracket\rho\}, \ldots, \rho\{t \mapsto (\llbracket p+q\rrbracket\rho - 1)\}\\
r_{p,q}^{(F;\mathcal{C})}: & [p,q)^{(F;\mathcal{C})} \xrightarrow{\ \sim\ } V_{\mathcal{C}\cup\{y<q\}}(F \cup \{y\})\\
& \rho\{t \mapsto (\llbracket p\rrbracket\rho + k)\} \mapsto \rho\{y \mapsto k\}\\
i_{1}^{(p,q;F;\mathcal{C})}: & V_{\mathcal{C}\cup\{x<p\}}(F \cup \{x\}) \to V_{\mathcal{C}\cup\{x<p+q\}}(F \cup \{x\})\\
& \rho\{x \mapsto k\} \mapsto \rho\{x \mapsto k\}\\
i_{2}^{(p,q;F;\mathcal{C})}: & [p,q)^{(F;\mathcal{C})} \to V_{\mathcal{C}\cup\{x<p+q\}}(F \cup \{x\})\\
& \rho\{t \mapsto (\llbracket p\rrbracket\rho + k)\} \mapsto \rho\{x \mapsto \llbracket p\rrbracket\rho + k\}
\end{array}$$

They satisfy $!_{[x<p]_{(F;\mathcal{C})}}[A]^{(F\cup\{x\};\,\mathcal{C}\cup\{x<p\})} = {!}_{[x<p]_{(F;\mathcal{C})}}\bigl([A]^{(F\cup\{x\};\,\mathcal{C}\cup\{x<p+q\})}\bigr|_{i_1}\bigr)$ and $!_{[y<q]_{(F;\mathcal{C})}}[A\{p+y/x\}]^{(F\cup\{y\};\,\mathcal{C}\cup\{y<q\})} = {!}_{Jr\bullet s}\bigl([A]^{(F\cup\{x\};\,\mathcal{C}\cup\{x<p+q\})}\bigr|_{i_2\circ r^{-1}}\bigr)$. Then the conclusion of (!C) is obtained:

$$V_{\mathcal{C}}(F) \mid [\Gamma]^{(F;\mathcal{C})},\ !_{(Ji_1 \bullet [x<p]) + (Ji_2 \bullet s)}\, [A]^{(F\cup\{x\};\,\mathcal{C}\cup\{x<p+q\})} \vdash [B]^{(F;\mathcal{C})}.$$

**–** For the (!P) rule, let F′ = F ∪ {x} and C′ = C ∪ {x < p}. We can prove the translated conclusion from the translated premise by the following proof:

$$\dfrac{\dfrac{V_{\mathcal{C}'}(F') \mid [A_1]^{(F';\mathcal{C}')}, \ldots, [A_n]^{(F';\mathcal{C}')} \vdash [B]^{(F';\mathcal{C}')}}{V_{\mathcal{C}'}(F') \mid\ !_{\mathrm{id}}[A_1]^{(F';\mathcal{C}')}, \ldots,\ !_{\mathrm{id}}[A_n]^{(F';\mathcal{C}')} \vdash [B]^{(F';\mathcal{C}')}}\ n \text{ times (!D)}}{V_{\mathcal{C}}(F) \mid\ !_{[x<p]_{(F;\mathcal{C})}}[A_1]^{(F';\mathcal{C}')}, \ldots,\ !_{[x<p]_{(F;\mathcal{C})}}[A_n]^{(F';\mathcal{C}')} \vdash\ !_{[x<p]_{(F;\mathcal{C})}}[B]^{(F';\mathcal{C}')}}$$

**–** For the (!N) rule, we define index sets Δ0, Δ1, Δ2 and constraint sets C0, C1, C2 by

$$\begin{aligned} \mathcal{C}_0 &= \mathcal{C} \cup \{ y < p \} & \Delta_0 &= V_{\mathcal{C}_0}(F \cup \{ y \})\\ \mathcal{C}_1 &= \mathcal{C} \cup \{ y < p,\ z < q\{y/w\} \} & \Delta_1 &= V_{\mathcal{C}_1}(F \cup \{ y, z \})\\ \mathcal{C}_2 &= \mathcal{C} \cup \{ x < \textstyle\sum_{w < p} q(w) \} & \Delta_2 &= V_{\mathcal{C}_2}(F \cup \{ x \}) \end{aligned}$$

There is an isomorphism r ∈ **Set**(Δ₁, Δ₂), and the equation $[z < q\{y/w\}]_{(F\cup\{y\};\,\mathcal{C}_0)} \bullet [y<p]_{(F;\mathcal{C})} = Jr^{-1} \bullet [x < \sum_{w<p} q(w)]_{(F;\mathcal{C})}$ holds. Therefore, the (!N) rule can be translated to the following provable judgment:

$$V_{\mathcal{C}}(F) \mid\ !_{\left[x<\sum_{w<p} q(w)\right]_{(F;\mathcal{C})}}[A]^{(F\cup\{x\};\,\mathcal{C}_2)} \vdash\ !_{[y<p]_{(F;\mathcal{C})}}\,!_{[z<q\{y/w\}]_{(F\cup\{y\};\,\mathcal{C}_0)}}\bigl[A\{\textstyle\sum_{w<y} q(w)+z\,/\,x\}\bigr]^{(F\cup\{y,z\};\,\mathcal{C}_1)}$$

Since every BLL proof of Γ ⊢ A can be translated to a CBLL proof of $\Gamma \vdash_{\emptyset} A$, it can further be translated to a GBAL<sup>+</sup> proof of $V_{\emptyset}(F) \mid [\Gamma]^{(F;\emptyset)} \vdash [A]^{(F;\emptyset)}$.

### **4 Categorical Semantics for GBLL**

We give a categorical semantics of GBLL. First, notice that each index set Δ determines a multiplicative linear logic under Δ. We model this situation by a *set-indexed symmetric monoidal closed category*, given by a functor C : **Set**<sup>op</sup> → **SMCC**<sub>strict</sub>. That is, for each Δ ∈ **Set**, a symmetric monoidal closed category CΔ is given, and any function f : Δ → Δ′ induces a strict symmetric monoidal closed functor Cf : CΔ′ → CΔ, performing renaming of indexes.

On top of this indexed symmetric monoidal closed category, we introduce a categorical structure that models the !<sub>f</sub> modality. We call it an *indexed linear exponential comonad*. This is a generalization of the *semiring-graded linear exponential comonad* studied in [13,22]. Our generalization replaces the semiring with **Idx**, which may be regarded as a many-object pseudo-semiring (Proposition 2.1).

We write [C, D]<sub>l</sub> for the category of symmetric lax monoidal functors from C to D and monoidal natural transformations between them. We equip it with the pointwise symmetric monoidal structure $(\dot{I}, \mathbin{\dot{\otimes}})$ given by $\dot{I}X = I$ and $(F \mathbin{\dot{\otimes}} G)X = FX \otimes GX$ for X ∈ C.

**Definition 4.1.** *An* indexed linear exponential comonad *(ILEC for short) over a set-indexed SMCC* C *consists of:*

**–** *A collection of symmetric colax monoidal functors*

$$(D, w^{\Delta, \Delta'}, c^{\Delta, \Delta'}) : \mathbf{Idx}(\Delta, \Delta') \to [C\Delta', C\Delta]\_l \quad (\Delta, \Delta' \in \mathbf{Set}).$$

*The symmetric lax monoidal structure of* Df *is denoted by* m<sup>f</sup> : I → DfI *and* mf,A,B : DfA ⊗ DfB → Df(A ⊗ B)*.*


The last axiom has two purposes: the equality Cr(DfA) = D(f • Jr)A allows reindexing functions to act from outside, and the other equality Df(CrA) = D(Jr • f)A makes D invariant under internal reindexing of formulas. These equalities are tied to the formula equivalence in (2.1) and to the definition of reindexing at !<sub>f</sub>A in Definition 2.1, respectively. We postpone a concrete example of an ILEC to Section 4.2.

#### **4.1 Semantics of GBLL**

We interpret a well-formed formula Δ ⊢ A as an object $\llbracket\Delta \vdash A\rrbracket \in \mathcal{C}\Delta$. This is done by induction on the structure of the formula. We assume that each atomic formula a ∈ A(Δ) comes with its interpretation as an object [a] ∈ CΔ.

$$\begin{aligned} \left[\Delta \vdash a \star r\right] & \stackrel{\text{def}}{=} Cr[a] \\ \left[\Delta \vdash A \otimes B\right] & \stackrel{\text{def}}{=} \left[\Delta \vdash A\right] \otimes \left[\Delta \vdash B\right] \quad \left[\Delta \vdash A \multimap B\right] \stackrel{\text{def}}{=} \left[\Delta \vdash A\right] \multimap \left[\Delta \vdash B\right] \end{aligned}$$


**Fig. 3.** Axioms of Indexed Linear Exponential Comonad

**Proposition 4.1.** *For any* r ∈ **Set**(Δ, Δ′) *and well-formed formula* Δ′ ⊢ A*, we have* $\llbracket\Delta \vdash A|_r\rrbracket = \mathcal{C}r\llbracket\Delta' \vdash A\rrbracket$*.*

**Proposition 4.2.** $\llbracket\Delta \vdash\ !_{Jr\bullet f}A\rrbracket = \llbracket\Delta \vdash\ !_{f}(A|_r)\rrbracket$*.*

Each proof π of Δ | Γ ⊢ A in GBLL is interpreted as a morphism $\llbracket\Delta \mid \Gamma \vdash A\rrbracket : \llbracket\Delta \vdash \Gamma\rrbracket \to \llbracket\Delta \vdash A\rrbracket$ in CΔ. Here, for a sequence Γ = C₁, ···, C_m of formulas, $\llbracket\Delta \vdash \Gamma\rrbracket$ denotes $\llbracket\Delta \vdash C_1\rrbracket \otimes \cdots \otimes \llbracket\Delta \vdash C_m\rrbracket$. We write out the interpretation only for the cases of the modalities, because the other rules (Axiom, Exchange, Cut, ⊗(L, R) and ⊸(L, R)) are interpreted as in the semantics of multiplicative intuitionistic linear logic. Fig. 4 shows the interpretation of the rules related to !<sub>f</sub>.

**Theorem 4.1.** *For a proof* π *of* Δ | Γ ⊢ A*, if* π *has a reducible cut and reduces to* π′ *by a reduction step, then* $\llbracket\pi\rrbracket = \llbracket\pi'\rrbracket$ *in* CΔ*.*

#### **4.2 Construction of an Indexed Linear Exponential Comonad**

We present a construction of an indexed SMCC C : **Set**<sup>op</sup> → **SMCC**<sub>strict</sub> and an ILEC D : **Idx**(Δ, Δ′) → [CΔ′, CΔ]<sub>l</sub> over C from an SMCC (C, ⊗, I, ⊸) and a symmetric lax monoidal comonad (V, m<sup>V</sup>, m<sup>V</sup><sub>X,Y</sub>, ε, δ) on C.

Here, 1) $\llbracket A\rrbracket$ denotes $\llbracket\Delta \vdash A\rrbracket$ for each well-formed formula Δ ⊢ A; 2) π denotes the proof of the premise of each rule.

**Fig. 4.** Interpretations of Modal Rules.

**Construction of Indexed SMCCs** First, for each index set Δ, we define the category $\prod_\Delta \mathcal{C}$ to be the product of Δ-many copies of C. We represent objects and morphisms of this category by maps X : Δ → Obj(C) and f : Δ → Mor(C), respectively. Since SMCCs are closed under products, $\prod_\Delta \mathcal{C}$ is an SMCC with the component-wise tensor product and internal hom:

$$\mathsf{I}(d) \stackrel{\text{def}}{=} I, \quad (\mathsf{X} \mathbin{\dot{\otimes}} \mathsf{Y})(d) \stackrel{\text{def}}{=} \mathsf{X}(d) \otimes \mathsf{Y}(d), \quad (\mathsf{X} \mathbin{\dot{\multimap}} \mathsf{Y})(d) \stackrel{\text{def}}{=} \mathsf{X}(d) \multimap \mathsf{Y}(d)$$

We then define the indexed SMCC C by $\mathcal{C}\Delta \stackrel{\text{def}}{=} \prod_\Delta \mathcal{C}$.

**Folding Product** We next introduce the *folding product* functor T; we later compose it with the symmetric lax monoidal comonad V so that we can derive various ILECs over C. Note that T itself also gives an ILEC: simply set V = Id. The type of T is $\Delta^{*} \times (\prod_\Delta \mathcal{C}) \longrightarrow \mathcal{C}$, and it is defined by

$$\mathsf{T}(i\_1 i\_2 \cdots i\_n, \mathsf{A}) \stackrel{\text{def}}{=} \mathsf{A}(i\_1) \otimes \mathsf{A}(i\_2) \otimes \cdots \otimes \mathsf{A}(i\_n), \quad \mathsf{T}((), \mathsf{A}) \stackrel{\text{def}}{=} I$$

On morphisms, T maps a list permutation in the first argument to the corresponding symmetry morphism in C. T is symmetric strong monoidal in each argument. Moreover, the two strong monoidal structures interact well with each other, so that T becomes a multi-symmetric strong monoidal functor in the sense of [21].

**Proposition 4.3.** *For* f ∈ **Idx**(Δ, Δ′) *and* l = i₁ ···i_k ∈ Δ*, let* f(l) *denote* f(i₁)··· f(i_k)*. Then* $\mathsf{T}(f(l), \mathsf{A}) \cong \mathsf{T}(l, \mathsf{T}(f(\_), \mathsf{A}))$ *and this isomorphism is natural in* A*.*
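For a small instance (our own, reading a morphism of **Idx**(Δ, Δ′) as assigning to each i ∈ Δ a list over Δ′, as in (4.1)), take Δ = {1, 2}, Δ′ = {a, b} and f with f(1) = ab and f(2) = a, so that for l = 12 we get f(l) = aba. Then

$$\mathsf{T}(f(l), \mathsf{A}) = \mathsf{A}(a) \otimes \mathsf{A}(b) \otimes \mathsf{A}(a) \;\cong\; (\mathsf{A}(a) \otimes \mathsf{A}(b)) \otimes \mathsf{A}(a) = \mathsf{T}(l, \mathsf{T}(f(\_), \mathsf{A})),$$

with the isomorphism supplied by the associativity of ⊗ in C.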

*Remark 4.1.* Usually the !-modal formula !A in linear logic is interpreted by the object consisting of many copies of the same data (referred to as *uniformity* of !A [8]). We leave the development of a uniform folding product as future work.

**Construction of ILEC** We now compose the folding product functor with the symmetric lax monoidal comonad V to derive another ILEC. Let Δ, Δ′ be index sets. We define a symmetric strong (hence colax) monoidal functor D : **Idx**(Δ, Δ′) −→ [CΔ′, CΔ]<sub>l</sub> by

$$Df\mathsf{A}(i) \stackrel{\text{def}}{=} \mathsf{T}(f(i), V \circ \mathsf{A}) \quad Df\mathsf{p}(i) \stackrel{\text{def}}{=} \mathsf{T}(f(i), V\mathsf{p}) \quad D\alpha\mathsf{A} \stackrel{\text{def}}{=} \mathsf{T}(\alpha, V \circ \mathsf{A}). \tag{4.1}$$

Here, A ∈ $\prod_{\Delta'} \mathcal{C}$, and p and α are morphisms in $\prod_{\Delta'} \mathcal{C}$ and **Idx**(Δ, Δ′), respectively. We also define a helper morphism $\gamma^{l}_{\mathsf{A}} : \mathsf{T}(l, V \circ \mathsf{A}) \to V\,\mathsf{T}(l, \mathsf{A})$ for l = (l₁ ··· l_k) ∈ Δ′* and A ∈ $\prod_{\Delta'} \mathcal{C}$. It is the multiple composite of $m^{V}_{X,Y}$:

$$V\mathsf{A}(l\_1)\otimes\cdots\otimes V\mathsf{A}(l\_k)\to V\left(\mathsf{A}(l\_1)\otimes\cdots\otimes\mathsf{A}(l\_k)\right).$$

It is routine to verify that this morphism is monoidal natural on l and A.

Two monoidal natural transformations $\epsilon : D(\mathrm{i}_\Delta) \to \mathrm{Id}_{\prod_\Delta \mathcal{C}}$ and $\delta_{g,f} : D(g \bullet f) \to Df \circ Dg$ are defined by:

$$
\epsilon_{\mathsf{A},i} : \mathsf{T}(i, V \circ \mathsf{A}) = V \mathsf{A}(i) \xrightarrow{\ \varepsilon\ } \mathsf{A}(i) \tag{4.2}
$$

$$
\delta_{g, f; \mathsf{A}; i} : \mathsf{T}((g \bullet f)(i), V \circ \mathsf{A}) \xrightarrow{\ \sim\ } \mathsf{T}(f(i), \mathsf{T}(g(\_), V \circ \mathsf{A})) \xrightarrow{\mathsf{T}(f(i),\, \mathsf{T}(g(\_),\, \delta_{\mathsf{A}}))} \mathsf{T}(f(i), \mathsf{T}(g(\_), V \circ V \circ \mathsf{A})) \xrightarrow{\mathsf{T}(f(i),\, \gamma^{g(\_)}_{V \circ \mathsf{A}})} Df(Dg \mathsf{A})(i). \tag{4.3}
$$

**Theorem 4.2.** *The symmetric colax monoidal functor* D *(4.1) and the monoidal natural transformations* ε, δ *(4.2, 4.3) determine an ILEC over* C*.*

#### **4.3 GBLL Semantics by Realizability Category**

Hofmann et al. and also Dal Lago et al. employ a *realizability semantics* to show that the complexity of BLL proof reductions lies in P-time [19,10]. In this section we compare their semantics with the simple semantics of GBLL constructed in the previous section.

We instantiate C in the previous section with the realizability category over a BCI algebra (A, ·), which is a combinatory algebra based on B, C, I-combinators; see e.g. [2,20]. We then form the realizability category **Ass**(A) by the following data: an object is a function f into P<sup>+</sup>A, where P<sup>+</sup> is the nonempty powerset construction, and a morphism from f to g is a function h : dom f → dom g with the following property: there exists an element e ∈ A such that for any x ∈ dom f and a ∈ f(x), we have e · a ∈ g(h(x)). The category **Ass**(A) is symmetric monoidal closed; see e.g. [20, Proposition 4]. The tensor product of f and g is given by (f ⊗ g)(x, y) = {u ⊗ v | u ∈ f(x), v ∈ g(y)}, where u ⊗ v is the BCI-algebra element corresponding to λx.xuv [20, Section 2].

Next, let Δ be a set and consider the power category $\prod_\Delta \mathbf{Ass}(A)$. Under the axiom of choice, $\prod_\Delta \mathbf{Ass}(A)$ is equivalently described as follows: an object is a family of functions {f<sub>i</sub>}<sub>i∈Δ</sub> into P<sup>+</sup>A, and a morphism from {f<sub>i</sub>}<sub>i∈Δ</sub> to {g<sub>i</sub>}<sub>i∈Δ</sub> is a family of functions {h<sub>i</sub> : dom f<sub>i</sub> → dom g<sub>i</sub>}<sub>i∈Δ</sub> with the following property: there exists a function e : Δ → A such that for any i ∈ Δ, x ∈ dom f<sub>i</sub> and a ∈ f<sub>i</sub>(x), we have e(i) · a ∈ g<sub>i</sub>(h<sub>i</sub>(x)).

This power category is quite close to the realizability category introduced in [19, Section 4] and [10, Section 4]. A membership statement a ∈ f<sub>i</sub>(x) for an object {f<sub>i</sub>}<sub>i∈Δ</sub> ∈ $\prod_\Delta \mathbf{Ass}(A)$ corresponds to a realizability statement $i, a \Vdash x$ in the realizability category (see [19]). The major difference between these categories is twofold: 1) in the realizability category, a computability constraint is imposed on e : Δ → A to achieve the characterization of P-time complexity; 2) objects in the realizability category are limited to $\prod_\Delta \mathbf{Ass}(A)$-objects such that all f<sub>i</sub> share a common domain. This is to synchronize with the set-theoretic semantics *ignoring* resource polynomials [19, Section 3] [10, Section 3].

We compute the bounded !-modality using the folding product ILEC T with respect to the indexed SMCC $\prod_{(-)} \mathbf{Ass}(A)$. Let F be a finite set of resource variables, v ∉ F a resource variable, p a resource polynomial and C a constraint set under F. For any object X in $\prod_{V_{\mathcal{C}\cup\{v<p\}}(F\cup\{v\})} \mathbf{Ass}(A)$, the folding product $\mathsf{T}([v<p]_{(F;\mathcal{C})}, \mathsf{X})$ is an object in $\prod_{V_{\mathcal{C}}(F)} \mathbf{Ass}(A)$ satisfying

$$\begin{aligned} &\mathsf{T}([v < p]_{(F;\mathcal{C})}, \mathsf{X})(i) \\ &= \lambda(x_0, \ldots, x_{\llbracket p\rrbracket i-1})\ .\ \{a_0 \otimes \cdots \otimes a_{\llbracket p\rrbracket i-1} \mid a_j \in \mathsf{X}(i\{v \mapsto j\})(x_j)\} \end{aligned} \tag{4.4}$$

This is different from the modality over the realizability category introduced in [19, Definition 16] and [10, Definition 4.6]:

$$(!_{v<p}\,\mathsf{X})(i) = \lambda x\ .\ \{a_0 \otimes \cdots \otimes a_{\llbracket p\rrbracket i-1} \mid a_j \in \mathsf{X}(i\{v \mapsto j\})(x)\};$$

it only takes a single argument. This is again because their realizability semantics is designed to synchronize with the set-theoretic semantics ignoring resource polynomials; in particular, it interprets $\llbracket !_{x<p}A\rrbracket = \llbracket A\rrbracket$. On the other hand, the bounded quantification computed in (4.4) does *not* ignore resource polynomials and indexing, as the domain of (4.4) is the index-dependent product $\prod_j \mathrm{dom}(\mathsf{X}(i\{v \mapsto j\}))$. From this, we conjecture that the semantics of BLL using the ILEC T over $\prod_{(-)} \mathbf{Ass}(A)$ realizes an *index-dependent* set-theoretic semantics of BLL; we leave this semantics as future work.

#### **5 Conclusion and Related Work**

We introduced GBLL, a generalization of Girard et al.'s BLL. We analyzed the complexity of cut-elimination in GBLL, and gave a translation into GBAL<sup>+</sup> from CBLL, an extension of BLL with constraints. We then introduced the ILEC as a categorical structure for interpreting the !-modality of GBLL. An ILEC is an **Idx**-graded linear exponential comonad interacting well with a specified indexed SMCC. We gave an elementary construction of an ILEC using the folding product, and a technique to derive its variants by inserting symmetric lax monoidal comonads. We gave the semantics of BLL using the folding product on the category of assemblies of a BCI-algebra, and related it to the realizability category studied in [19,10].

Girard's BLL has had a great influence on the subsequent development of indexed modalities and implicit complexity theory [16]. Hofmann and Scott introduced the realizability technique to BLL and semantically proved that BLL characterizes P-time complexity [19]. Their work was further enriched and studied by Dal Lago and Hofmann [10]. Gaboardi combined the !-modality involving variable binding with PCF and showed that the combined system is relatively complete [24].

Bucciarelli and Ehrhard's *indexed linear logic with exponential* [9] is one of the closest systems to GBLL. However, the type of the !-modality is different: their system derives Δ ⊢ !<sub>f</sub>A from Δ′ ⊢ A and an *almost injective* function f : Δ′ → Δ, that is, a function where each f<sup>−1</sup>(i) is finite. To relate their system and GBLL, let us use the finite powerset construction P<sub>fin</sub> and convert f into its inverse f<sup>−1</sup> : Δ → P<sub>fin</sub>(Δ′). This exhibits the similarity with GBLL: GBLL relaxes P<sub>fin</sub> to ( )*, and takes the inverse as the parameter for the !-modality. The novelty of this work over [9] is that a categorical axiomatization for the !<sub>f</sub> modality is identified as an extension of the graded linear exponential comonads [7,22]. Another novelty is to show that GBLL is expressive enough to encode BLL.

As described in Section 1, the simple form of !-modality !<sub>r</sub>A is also widely used in various type systems and programming languages. Examples include: INTML [30], the coeffect calculus [28,7] and its combination with effect systems [13], the Granule language [26], bounded linear type systems [14,26], type systems for the analysis of higher-order model-checking [18,17], a generic BLL-like logic B<sup>S</sup>LL over semirings [6], the Fuzz type system for function sensitivity and differential privacy [29,12,3], and many more. A combination of !<sub>r</sub>A with dependent type theory, called QTT, is also introduced in [25] and [4]. Among these systems, [12,26,1] support (1) full universal and existential, (2) full universal, and (3) partial universal quantification over grades, respectively.

The categorical structure corresponding to the simple form of !-modality appears in [7,13,22] and is identified as a *semiring-graded linear exponential comonad*. Breuvart constructed various examples of semiring-graded linear exponential comonads on relational models of linear logic [6] using his *slicing* technique. In this work we replaced semirings with **Idx**, which may be seen as a multi-object pseudo-semiring. In the study of graded monads, Orchard et al. generalize the grading structure from ordered monoids to 2-categories [27]. The main difference from this work is that their generalized graded monad is defined over a single category, while an ILEC is defined over an *indexed* SMCC.

#### **References**

1. Abel, A., Bernardy, J.P.: A unified view of modalities in type systems. Proc. ACM Program. Lang. **4**(ICFP) (Aug 2020). https://doi.org/10.1145/3408972


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **Focused Proof-search in the Logic of Bunched Implications**

Alexander Gheorghiu and Sonia Marin

University College London, London, United Kingdom {alexander.gheorghiu.19, s.marin}@ucl.ac.uk

**Abstract.** The logic of Bunched Implications (BI) freely combines additive and multiplicative connectives, including implications; however, despite its well-studied proof theory, proof-search in BI has always been a difficult problem. The focusing principle is a restriction of the proof-search space that can capture various goal-directed proof-search procedures. In this paper we show that focused proof-search is complete for BI by first reformulating the traditional bunched sequent calculus using the simpler data-structure of nested sequents, followed by a polarised and focused variant that we show is sound and complete via a cut-elimination argument. This establishes an operational semantics for focused proof-search in the logic of Bunched Implications.

**Keywords:** Logic · Proof-search · Focusing · Bunched Implications.

#### **1 Introduction**

The *Logic of Bunched Implications* (BI) [31] is well-known for its applications in systems modelling [32], especially a particular theory (of a variant of BI) called *Separation Logic* [37,23] which has found industrial use in program verification. In this work, we study an aspect of proof search in BI, relying on its well-developed and well-studied proof theory [33]. We show that a goal-directed proof-search procedure known as *focused proof-search* is complete; that is, if there is a proof then there is a focused one. Focused proofs are both interesting in the abstract, giving insight into the proof theory of the logic, and have (for other logics) been a useful modelling technology in applied settings. For example, focused proof-search forms an operational semantics of the DPLL SAT-solvers [14], logic programming [29,1,13,7], automated theorem provers [28], and has been successful in providing a meta-theoretic framework in intuitionistic, substructural, and modal logics [27,30,25].

Syntactically BI combines additive and multiplicative connectives, but unlike related logics such as Linear Logic (LL) [22], BI takes all the connectives as primitive. Indeed, it arose from a proof-theoretic investigation on the relationship between conjunction and implication. As a result, sequents in BI have a

 This work has been partially supported by the UK's EPSRC through research grant EP/S013008/1.


more complicated structure: each implication comes with an associated context-former. Therefore, in BI contexts are not lists, nor multisets, but instead are *bunches*: binary trees whose leaves are formulas and internal nodes context-formers. Additive composition (Γ; Δ) admits the structural rules of weakening and contraction, whereas multiplicative composition (Γ, Δ) denies them. The principal technical challenges when studying proof-search in BI arise from the interaction between the additive and multiplicative fragments. We overcome these challenges by restricting the application of structural rules in the sequent calculus LBI as well as working with a representation of bunches as nested multisets.

Throughout we use the term *sequent calculus* in a strict sense; that is, meaning a label-free internal sequent calculus, formed in the case of BI by a context (a bunch) and a consequent (a formula). The term *proof-search* is consistently understood to be read as backward reduction within such a system. Although there is an extensive body of research on systems and procedures for semantics-based calculi in BI [19,20,16,17,18], there has been comparatively little formal study on proof-search in the strict sense. One exception is the completeness result for (unit-simple) uniform proofs [2] which is partially subsumed by the results herein.

The *focusing principle* was introduced for Linear Logic [1] and is characterised by alternating *focused* and *unfocused* phases of goal-directed proof-search. The unfocused phase comprises rules which are safe to apply (i.e. rules where provability is invariant); conversely, the focused phase contains the reduction of a formula and its sub-formulas where potentially invalid sequents may arise, and backtracking may be required. During focused proof-search the unfocused phases are performed eagerly, followed by controlled goal-directed focused phases, until safe reductions are available again. We say that the focusing principle holds when every provable sequent has a focused proof. This alternation can be enforced by a mechanism based on a partition of the set of formulas into two classes, *positive* and *negative*, which correspond to safe behaviour on the left and right respectively; that is, for negative formulas provability is invariant with respect to the application of a right rule, and for positive formulas, of a left rule, but in the other cases the application may result in invalid sequents.

The original proof of the focusing principle in Linear Logic was via long and tedious permutations of rules [1]. In this paper, we use for BI a different methodology, originally presented in [24], which has since been implemented in a variety of logics [25,5,6] and proof systems [13]. The method is as follows: given a sequent calculus, first one polarises the syntax according to the positive/negative behaviours; second, one gives a focused variation of the sequent calculus where the control flow of proof-search is managed by polarisation; third, one shows that this system admits cut (the only non-analytic rule); and, finally, one shows that in the presence of cut the original sequent calculus may be simulated in the focused one. When the polarised system is complete, the focusing principle holds.

In LBI certain rules (the structural rules) have no natural placement in either the focused or the unfocused phases of proof-search. Thus, a design choice must be made: to eliminate/constrain these rules, or to permit them without restriction. The first gives a stricter control proof-search regime, but the latter typically achieves a more well-behaved proof theoretic meta-theory. In this paper, we choose the former as our motivation is to study computational behaviour of proof-search in BI, the latter being recovered by familiar admissibility results. The only case where confinement is not possible is the *exchange* rule. In standard sequent calculi the exchange rule is made implicit by working with a more convenient data-structure such as multisets as opposed to lists; however, the specific structure of bunches in BI means that a more complex alternative is required. The solution presented is to use nested multisets of two types (additive and multiplicative) corresponding to the two different context-formers/conjunctions.

In Section 2 we present the logic of Bunched Implications; in particular, Section 2.1 and Section 2.2 contain the background on BI (the syntax and sequent calculus respectively); meanwhile, Section 2.3 gives a representation of bunches as nested multisets. Section 3 contains the focused system: first, in Section 3.1 we introduce the polarised syntax; second, in Section 3.2 we introduce the focused sequent calculus and some metatheory, most importantly the cut-admissibility result; finally, in Section 3.3 we give the completeness theorem, from which the validity of the focusing principle follows as a corollary. We conclude in Section 4 with some further discussion and future directions.

#### **2 Re-presentations of BI**

#### **2.1 Traditional Syntax**

The logic BI has a well-studied metatheory admitting familiar categorical, algebraic, and truth-functional semantics which have the expected dualities [34,17,33,11,32]. In practice, it is the free combination (or, more precisely, the fibration [15,33]) of intuitionistic logic (IL) and the multiplicative fragment of intuitionistic linear logic (MILL), which imposes the presence of two distinct context-formers in its sequent presentation. That is to say, the two conjunctions ∧ and ∗ are represented at the meta-level by context-formers ; and , in place of the usual commas for IL and MILL respectively.

**Definition 1 (Formula).** *Let* P *be a denumerable set of propositional letters. The* formulas *of BI, denoted by small Greek letters (*ϕ, ψ, χ, . . .*), are defined by the following grammar, where* A ∈ P*,*

ϕ ::= ⊤ | ⊥ | ⊤∗ | A | (ϕ ∧ ϕ) | (ϕ ∨ ϕ) | (ϕ → ϕ) | (ϕ ∗ ϕ) | (ϕ −∗ ϕ)

*If* ◦ ∈ {∧, ∨, →, ⊤} *then it is an additive connective and if* ◦ ∈ {∗, −∗, ⊤∗} *then it is a multiplicative connective. The set of all formulas is denoted* F*.*

**Definition 2 (Bunch).** *A* bunch *is constructed from the following grammar, where* <sup>ϕ</sup> <sup>∈</sup> <sup>F</sup>*,*

$$\Delta ::= \varphi \mid \varnothing_+ \mid \varnothing_\times \mid (\Delta; \Delta) \mid (\Delta, \Delta)$$

*The symbols* <sup>∅</sup><sup>+</sup> *and* <sup>∅</sup><sup>×</sup> *are the additive and multiplicative units respectively, and the symbols* ; *and* , *are the additive and multiplicative context-formers respectively. A bunch is* basic *if it is a formula,* <sup>∅</sup>+*, or* <sup>∅</sup><sup>×</sup> *and* complex *otherwise. The set of all bunches is denoted* B*, the set of complex bunches with additive root context-former by* B<sup>+</sup>*, and the set of complex bunches with multiplicative root context-former by* B×*.*

For two bunches Δ, Δ′ ∈ B, if Δ′ is a sub-tree of Δ it is called a *sub-bunch*. We may use the standard notation Δ(Δ′) (despite its slight impracticality) to denote that Δ′ is a sub-bunch of Δ, in which case Δ(Δ′′) is the result of replacing the occurrence of Δ′ by Δ′′. If δ is a sub-bunch of Δ, then the context-former ◦ is said to be its principal context-former in Δ(Δ′ ◦ δ) (and Δ(δ ◦ Δ′)).

*Example 3.* Let <sup>ϕ</sup>, <sup>ψ</sup> and <sup>χ</sup> be formulas, and let <sup>Δ</sup> = (ϕ,(χ; <sup>∅</sup>+)); (ψ; (ψ; <sup>∅</sup>×)). The bunch may be written for example as Δ(ϕ,(χ; ∅+)) which means that we can have <sup>Δ</sup>(ϕ; <sup>ϕ</sup>)=(ϕ; <sup>ϕ</sup>); (ψ; (ψ; <sup>∅</sup>×)).

**Definition 4 (Bunched Sequent).** *A bunched sequent is a pair of a bunch* Δ*, called the context, and a formula* ϕ*, denoted* Δ ⇒ ϕ*.*

Bunches are intended to be considered up-to *coherent equivalence* (≡). It is the least relation satisfying:


It will be useful to have a measure on sub-bunches which can identify their distance from the root node.

**Definition 5 (Rank).** *If* Δ′ *is a sub-bunch of* Δ*, then* ρ(Δ′) *is the number of alternations of additive and multiplicative context-formers between the principal context-former of* Δ′ *and the root context-former of* Δ*.*

Let Δ be a complex bunch; we use Δ′ ∈ Δ to denote that Δ′ is a (proper) top-most sub-bunch; that is, Δ′ is a sub-bunch satisfying Δ′ ≠ Δ but ρ(Δ′) = 0.

*Example 6.* Let Δ be as in Example 3; then ρ(∅+) = 2 whereas ρ(∅×) = 0; hence ψ, ∅× and (ϕ, (χ; ∅+)) ∈ Δ. Consider the parse-tree of Δ: reading upward from ∅+ one encounters first ;, which changes into , and then back to ;, so the rank is 2; whereas counting up from ∅× one only encounters ;, so the rank is 0.

$$
\begin{array}{c}
\dfrac{}{A \Rightarrow A}\ \mathsf{Ax} \qquad
\dfrac{}{\Delta(\bot) \Rightarrow \varphi}\ \bot_{\mathsf{L}} \qquad
\dfrac{}{\varnothing_\times \Rightarrow \top^{*}}\ \top^{*}_{\mathsf{R}} \qquad
\dfrac{}{\varnothing_+ \Rightarrow \top}\ \top_{\mathsf{R}}
\\[2.5ex]
\dfrac{\Delta' \Rightarrow \varphi \quad \Delta(\Delta'', \psi) \Rightarrow \chi}{\Delta(\Delta', \Delta'', \varphi \mathbin{-\!\ast} \psi) \Rightarrow \chi}\ {-\!\ast}_{\mathsf{L}} \qquad
\dfrac{\Delta, \varphi \Rightarrow \psi}{\Delta \Rightarrow \varphi \mathbin{-\!\ast} \psi}\ {-\!\ast}_{\mathsf{R}} \qquad
\dfrac{\Delta(\varphi, \psi) \Rightarrow \chi}{\Delta(\varphi \ast \psi) \Rightarrow \chi}\ \ast_{\mathsf{L}}
\\[2.5ex]
\dfrac{\Delta \Rightarrow \varphi \quad \Delta' \Rightarrow \psi}{\Delta, \Delta' \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{R}} \qquad
\dfrac{\Delta(\varnothing_\times) \Rightarrow \chi}{\Delta(\top^{*}) \Rightarrow \chi}\ \top^{*}_{\mathsf{L}} \qquad
\dfrac{\Delta(\varphi; \psi) \Rightarrow \chi}{\Delta(\varphi \wedge \psi) \Rightarrow \chi}\ \wedge_{\mathsf{L}} \qquad
\dfrac{\Delta \Rightarrow \varphi \quad \Delta' \Rightarrow \psi}{\Delta; \Delta' \Rightarrow \varphi \wedge \psi}\ \wedge_{\mathsf{R}}
\\[2.5ex]
\dfrac{\Delta(\varnothing_+) \Rightarrow \chi}{\Delta(\top) \Rightarrow \chi}\ \top_{\mathsf{L}} \qquad
\dfrac{\Delta(\varphi) \Rightarrow \chi \quad \Delta(\psi) \Rightarrow \chi}{\Delta(\varphi \vee \psi) \Rightarrow \chi}\ \vee_{\mathsf{L}} \qquad
\dfrac{\Delta \Rightarrow \varphi}{\Delta \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}1} \qquad
\dfrac{\Delta \Rightarrow \psi}{\Delta \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}2}
\\[2.5ex]
\dfrac{\Delta' \Rightarrow \varphi \quad \Delta(\Delta''; \psi) \Rightarrow \chi}{\Delta(\Delta'; \Delta''; \varphi \to \psi) \Rightarrow \chi}\ \to_{\mathsf{L}} \qquad
\dfrac{\Delta; \varphi \Rightarrow \psi}{\Delta \Rightarrow \varphi \to \psi}\ \to_{\mathsf{R}} \qquad
\dfrac{\Delta(\Delta'; \Delta') \Rightarrow \chi}{\Delta(\Delta') \Rightarrow \chi}\ \mathsf{C}
\\[2.5ex]
\dfrac{\Delta(\Delta') \Rightarrow \chi}{\Delta(\Delta'; \Delta'') \Rightarrow \chi}\ \mathsf{W} \qquad
\dfrac{\Delta \Rightarrow \chi}{\Delta' \Rightarrow \chi}\ \mathsf{E}\ (\Delta \equiv \Delta') \qquad
\dfrac{\Delta' \Rightarrow \varphi \quad \Delta(\varphi) \Rightarrow \chi}{\Delta(\Delta') \Rightarrow \chi}\ \mathsf{cut}
\end{array}
$$

**Fig. 1.** Sequent Calculus LBI

#### **2.2 Sequent Calculus**

The proof theory of BI is well-developed, including familiar Hilbert, natural deduction, sequent calculi, tableaux systems, and display calculi [33,17,3]. In what follows we restrict attention to the sequent calculus as it is more amenable to studying proof-search as computation, having local correctness while enjoying the completeness of analytic proofs.

**Definition 7 (System** LBI**).** *The bunched sequent calculus* LBI *is composed of the rules in Figure 1.*

The classification of ∧ as additive may seem dubious upon reading the ∧<sub>R</sub> rule, but the designation arises from the use of the structural rules; that is, the ∧<sub>R</sub> and →<sub>R</sub> rules may be replaced by *additive* variants without loss of generality. The presentation in Figure 1 is as in [33] and simply highlights the nature of the additive and multiplicative context-formers. Nonetheless, the choice of rule does affect proof-search behaviours, and the consequences are discussed in more detail in Section 3.1.

**Lemma 8 (Cut-elimination).** *If* ϕ *has an* LBI*-proof, then it has a* cut*-free* LBI*-proof, i.e., a proof with no occurrence of the* cut *rule.*

Throughout, unless specified otherwise, we take proof to mean cut-free proof. Moreover, if L is a sequent calculus we use $\vdash_{\mathsf{L}} \Delta \Rightarrow \varphi$ to denote that there is an L-proof of Δ ⇒ ϕ. Further, if R is a rule, then we write L + R for the sequent calculus combining the rules of L with R.

The following result, that a generalised version of the axiom is derivable in LBI, will allow for such sequents to be used in proof-construction later on.

**Lemma 9.** *For any formula* ϕ*,* $\vdash_{\mathsf{LBI}} \varphi \Rightarrow \varphi$*.*

*Proof.* By induction on the size of ϕ.
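For instance, the inductive step for a formula ϕ ∗ ψ composes the inductive hypotheses for ϕ and ψ using ∗<sub>R</sub> and ∗<sub>L</sub>:

$$\dfrac{\dfrac{\varphi \Rightarrow \varphi \qquad \psi \Rightarrow \psi}{\varphi, \psi \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{R}}}{\varphi \ast \psi \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{L}}$$

The other connectives are handled analogously with their left and right rules.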

The remainder of this section is the meta-theory required to control the structural rules, which pose the main issue to the study of proof-search in BI.

**Lemma 10.** *The following rules are derivable in* LBI*, and replacing* W *with them does not affect the completeness of the system.*

$$\begin{array}{c}
\dfrac{}{\Delta; A \Rightarrow A}\ \mathsf{Ax}' \qquad
\dfrac{}{\Delta; \varnothing_\times \Rightarrow \top^{*}}\ \top^{*\prime}_{\mathsf{R}} \qquad
\dfrac{}{\Delta; \varnothing_+ \Rightarrow \top}\ \top'_{\mathsf{R}}
\\[2.5ex]
\dfrac{\Delta \Rightarrow \varphi \quad \Delta' \Rightarrow \psi}{(\Delta, \Delta'); \Delta'' \Rightarrow \varphi \ast \psi}\ \ast'_{\mathsf{R}} \qquad
\dfrac{\Delta' \Rightarrow \varphi \quad \Delta(\Delta'', \psi) \Rightarrow \chi}{\Delta(\Delta', \Delta'', (\Delta'''; \varphi \mathbin{-\!\ast} \psi)) \Rightarrow \chi}\ {-\!\ast}'_{\mathsf{L}}
\end{array}$$

*Proof.* We can construct in LBI derivations with the same premisses and conclusion as these rules by use of the structural rules. Let LBI′ be LBI without W but with these new rules (retaining also ∗<sub>R</sub>, −∗<sub>L</sub>, ⊤∗<sub>R</sub>, ⊤<sub>R</sub>, and Ax); then W is admissible in LBI′ using a standard permutation argument.
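For instance, ∗′<sub>R</sub> can be simulated in LBI by applying ∗<sub>R</sub> and then weakening in the additive context:

$$\dfrac{\dfrac{\Delta \Rightarrow \varphi \qquad \Delta' \Rightarrow \psi}{\Delta, \Delta' \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{R}}}{(\Delta, \Delta'); \Delta'' \Rightarrow \varphi \ast \psi}\ \mathsf{W}$$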

One may regard the above modification to LBI as forming a new calculus, but since all the new rules are derivable it is really a restriction of the calculus, in the sense that all proofs in the new system have equivalent proofs in LBI differing only by explicitly including instances of weakening.

#### **2.3 Nested Calculus**

Originally, sequents in the calculi for classical and intuitionistic logics (LK and LJ, respectively) were introduced as lists, and a formal *exchange* rule was required to permute elements when needed for a logical rule to be applied [21]. However, in practice, the exchange rule is often suppressed, and contexts are simply presented as multisets of formulas. This reduces the number of steps/choices being made during proof-search without increasing the complexity of the underlying data structure. Bunches have considerably more structure than lists, but a quotient with respect to coherent equivalence can be made resulting in two-sorted nested multisets; this was first suggested in [12], though never formally realised.

**Definition 11 (Two-sorted Nest).** *Nests* (Γ) *are formulas or multisets, ascribed either additive* (Σ)*, or multiplicative* (Π) *kind, containing nests of the opposite kind:*

$$\Gamma := \Sigma \mid \Pi \qquad \Sigma := \varphi \mid \{\Pi\_1, \dots, \Pi\_n\}\_+ \qquad \Pi := \varphi \mid \{\Sigma\_1, \dots, \Sigma\_n\}\_\times$$

*The constructors are multiset constructors which may be empty in which case the nests are denoted* <sup>∅</sup><sup>+</sup> *and* <sup>∅</sup><sup>×</sup> *respectively. No multiset is a singleton; and the set of all nests is denoted* B/≡*.*

Given nests Λ and Γ, we write Λ ∈ Γ to denote either that Λ = Γ, if Γ is a formula, or that Λ is an element of the multiset Γ otherwise. Furthermore, we write <sup>Λ</sup> <sup>⊆</sup> <sup>Γ</sup> to denote <sup>∀</sup><sup>γ</sup> <sup>∈</sup> <sup>B</sup>/<sup>≡</sup> if <sup>γ</sup> <sup>∈</sup> <sup>Λ</sup> then <sup>γ</sup> <sup>∈</sup> <sup>Γ</sup>.

We will depart from the standard, yet impractical, sub-bunch notation, and adopt a context notation for nests instead. We write Γ{·}+ (resp. Γ{·}×) for a nest with a hole within one of its additive (resp. multiplicative) multisets. The notation Γ{Λ}+ (resp. Γ{Λ}×) denotes that Λ is a sub-nest of Γ of additive (resp. multiplicative) kind; we may use Γ{Λ} when the kind is not specified. In either case Γ{Λ′} denotes the substitution of Λ′ for Λ. A promotion in the syntax tree may be required after a substitution, either to handle a singleton or an improper alternation of constructor types.

*Example 12.* The following inclusions are valid,

$$\{\varphi,\chi\}_{\times} \in \left\{\{\varphi,\chi\}_{\times},\psi\right\}_{+} \subseteq \left\{\{\varphi,\chi\}_{\times},\psi,\psi,\varnothing_{\times}\right\}_{+} = \Gamma\{\{\varphi,\chi\}_{\times}\}_{+}$$

It follows that Γ{{ϕ, ϕ}+}+ = {ϕ, ϕ, ψ, ψ, ∅×}+. Note the absence of the {·}+ constructor after substitution; this is due to a promotion in the syntax tree to avoid having two nested additive constructors. Similarly, since ∅× denotes the empty multiset of multiplicative kind, substituting χ with it gives {ϕ, ψ, ψ, ∅×}+; that is, first the improper {ϕ, ∅×}× becomes {ϕ}×; then the resulting singleton {ϕ}× is promoted to ϕ.

Typically we will only be interested in fragments of sub-nests so we have the following abuse of notation, where ◦∈{+, ×}:

$$\Gamma\{\{\Pi\_1,\dots,\Pi\_i\}\_\diamond,\Pi\_{i+1},\dots,\Pi\_n\}\_\diamond := \Gamma\{\Pi\_1,\dots,\Pi\_n\}\_\diamond.$$

The notion of rank has a natural analogue in this setting.

**Definition 13 (Depth, Rank).** *Let* ◦ ∈ {+, ×}*; the* depth *of a nest in* B/≡ *is defined as follows:*

$$\delta(\varphi) := 0 \qquad \delta(\{F\_1, \dots, F\_n\}\_{\diamond}) := \max\{\delta(F\_1), \dots, \delta(F\_n)\} + 1$$

The equivalence of the two presentations, bunches and nests, follows from a moral (in the sense that bunches are intended to be considered modulo congruence) inverse between a *nestifying* function η and a *bunching* function β. The transformation β is simply going from a tree with arbitrary branching to a binary one, and η is the reverse.

**Definition 14 (Canonical Translation).** *The canonical translation* <sup>η</sup> : <sup>B</sup> <sup>→</sup> B/<sup>≡</sup> *is defined recursively as follows,*

$$\eta(\Delta) \coloneqq \begin{cases} \Delta & \text{if } \Delta \in \mathbb{F} \cup \{\varnothing_+, \varnothing_\times\} \\ \{\eta(\Delta') \in \mathbb{B}/{\equiv} \mid \rho(\Delta') = 1 \text{ and } \Delta' \in \mathbb{B}^{\times}\}_+ & \text{if } \Delta \in \mathbb{B}^{+} \\ \{\eta(\Delta') \in \mathbb{B}/{\equiv} \mid \rho(\Delta') = 1 \text{ and } \Delta' \in \mathbb{B}^{+}\}_\times & \text{if } \Delta \in \mathbb{B}^{\times} \end{cases}$$

*The canonical translation* <sup>β</sup> : <sup>B</sup>/<sup>≡</sup> <sup>→</sup> <sup>B</sup> *is defined recursively as follows,*

$$\beta(\Gamma) := \begin{cases} \Gamma & \text{if } \Gamma \in \mathbb{F} \cup \{\varnothing_+, \varnothing_\times\} \\ \beta(\Pi_1); (\beta(\Pi_2); \ldots) & \text{if } \Gamma = \{\Pi_1, \Pi_2, \ldots\}_+ \\ \beta(\Sigma_1), (\beta(\Sigma_2), \ldots) & \text{if } \Gamma = \{\Sigma_1, \Sigma_2, \ldots\}_\times \end{cases}$$

*Example 15.* Applying η to the bunch in Example 3 gives the nest in Example 12.

**Lemma 16.** *The translations are inverses up-to congruence; that is,*


*Proof.* The first two statements follow by induction on the depth (either for bunches or nests), where one must take care to consider the case of a context consisting entirely of units. The third statement employs the first in the forward direction, and proceeds by induction on depth in the reverse direction.

**Definition 17 (System** ηLBI**).** *The nested sequent calculus* ηLBI *is composed of the rules in Figure 2, where the metavariables denote possibly empty nests.*

Observe the use of metavariable Γ instead of Π (resp. Σ) as sub-contexts in Figure 2. This allows classes of inferences such as

$$\frac{\{\Sigma_0, \dots, \Sigma_i\}_\times \Rightarrow \varphi \qquad \{\Sigma_{i+1}, \dots, \Sigma_n\}_\times \Rightarrow \psi}{\{\Sigma_0, \dots, \Sigma_n\}_\times \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{R}}$$

to be captured by a single figure. In practice it implements the abuse of notation given above:

{{Σ0, ..., Σ<sup>i</sup>}×, {Σ<sup>i</sup>+1, ..., Σ<sup>n</sup>}×}<sup>×</sup> ⇒ ϕ ∗ ψ

This system is a new and very convenient presentation of LBI, not *per se* a development of the proof theory for the logic.

**Lemma 18 (Soundness and Completeness of** ηLBI**).** *Systems* LBI *and* ηLBI *are equivalent:*

*Soundness:* if $\vdash_{\eta\mathsf{LBI}} \Gamma \Rightarrow \varphi$ then $\vdash_{\mathsf{LBI}} \beta(\Gamma) \Rightarrow \varphi$;
*Completeness:* if $\vdash_{\mathsf{LBI}} \Delta \Rightarrow \varphi$ then $\vdash_{\eta\mathsf{LBI}} \eta(\Delta) \Rightarrow \varphi$.

*Proof.* Each claim follows by induction on the context, appealing to Lemma 16 to organise the data structure for the induction hypothesis, without loss of generality.

$$
\begin{array}{c}
\dfrac{}{\{\Gamma, A\}_+ \Rightarrow A}\ \mathsf{Ax} \qquad
\dfrac{}{\Gamma\{\bot\} \Rightarrow \chi}\ \bot_{\mathsf{L}} \qquad
\dfrac{}{\varnothing_\times \Rightarrow \top^{*}}\ \top^{*}_{\mathsf{R}} \qquad
\dfrac{}{\Gamma \Rightarrow \top}\ \top_{\mathsf{R}}
\\[2.5ex]
\dfrac{\Gamma' \Rightarrow \varphi \quad \Gamma\{\Gamma'', \psi\}_\times \Rightarrow \chi}{\Gamma\{\Gamma', \Gamma'', \{\Gamma''', \varphi \mathbin{-\!\ast} \psi\}_+\}_\times \Rightarrow \chi}\ {-\!\ast}_{\mathsf{L}} \qquad
\dfrac{\{\Gamma, \varphi\}_\times \Rightarrow \psi}{\Gamma \Rightarrow \varphi \mathbin{-\!\ast} \psi}\ {-\!\ast}_{\mathsf{R}} \qquad
\dfrac{\Gamma\{\{\varphi, \psi\}_\times\} \Rightarrow \chi}{\Gamma\{\varphi \ast \psi\} \Rightarrow \chi}\ \ast_{\mathsf{L}}
\\[2.5ex]
\dfrac{\Gamma \Rightarrow \varphi \quad \Gamma' \Rightarrow \psi}{\{\{\Gamma, \Gamma'\}_\times, \Gamma''\}_+ \Rightarrow \varphi \ast \psi}\ \ast_{\mathsf{R}} \qquad
\dfrac{\Gamma\{\varnothing_\times\} \Rightarrow \chi}{\Gamma\{\top^{*}\} \Rightarrow \chi}\ \top^{*}_{\mathsf{L}} \qquad
\dfrac{\Gamma\{\{\varphi, \psi\}_+\} \Rightarrow \chi}{\Gamma\{\varphi \wedge \psi\} \Rightarrow \chi}\ \wedge_{\mathsf{L}} \qquad
\dfrac{\Gamma \Rightarrow \varphi \quad \Gamma \Rightarrow \psi}{\Gamma \Rightarrow \varphi \wedge \psi}\ \wedge_{\mathsf{R}}
\\[2.5ex]
\dfrac{\Gamma\{\varnothing_+\} \Rightarrow \chi}{\Gamma\{\top\} \Rightarrow \chi}\ \top_{\mathsf{L}} \qquad
\dfrac{\Gamma\{\varphi\} \Rightarrow \chi \quad \Gamma\{\psi\} \Rightarrow \chi}{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}\ \vee_{\mathsf{L}} \qquad
\dfrac{\Gamma \Rightarrow \varphi}{\Gamma \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}1} \qquad
\dfrac{\Gamma \Rightarrow \psi}{\Gamma \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}2}
\\[2.5ex]
\dfrac{\Gamma' \Rightarrow \varphi \quad \Gamma\{\Gamma', \psi\}_+ \Rightarrow \chi}{\Gamma\{\Gamma', \varphi \to \psi\}_+ \Rightarrow \chi}\ \to_{\mathsf{L}} \qquad
\dfrac{\{\Gamma, \varphi\}_+ \Rightarrow \psi}{\Gamma \Rightarrow \varphi \to \psi}\ \to_{\mathsf{R}} \qquad
\dfrac{\Gamma\{\Gamma', \Gamma'\}_+ \Rightarrow \chi}{\Gamma\{\Gamma'\}_+ \Rightarrow \chi}\ \mathsf{C}
\end{array}
$$

**Fig. 2.** Sequent Calculus ηLBI

*Example 19.* The following is a proof in ηLBI.

$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{A \Rightarrow A \quad \{B, C\}_+ \Rightarrow B}{\{A, \{B, C\}_+\}_\times \Rightarrow A \ast B}\ \ast_{\mathsf{R}}
      }{\{A, (B \wedge C)\}_\times \Rightarrow A \ast B}\ \wedge_{\mathsf{L}}
      \qquad
      \dfrac{
        \dfrac{A \Rightarrow A \quad \{B, C\}_+ \Rightarrow C}{\{A, \{B, C\}_+\}_\times \Rightarrow A \ast C}\ \ast_{\mathsf{R}}
      }{\{A, (B \wedge C)\}_\times \Rightarrow A \ast C}\ \wedge_{\mathsf{L}}
    }{\{A, (B \wedge C)\}_\times \Rightarrow (A \ast B) \wedge (A \ast C)}\ \wedge_{\mathsf{R}}
  }{A \ast (B \wedge C) \Rightarrow (A \ast B) \wedge (A \ast C)}\ \ast_{\mathsf{L}}
}{\varnothing_\times \Rightarrow (A \ast (B \wedge C)) \to ((A \ast B) \wedge (A \ast C))}\ \to_{\mathsf{R}}
$$

We expect no obvious difficulty in studying focused proof-search with bunches instead of nested multisets; the design choice is simply to reduce the complexity of the argument by pushing all uses of exchange (E) to Lemma 18, rather than tackle it at the same time as focusing itself. In particular, working without the nested system would mean working with a weaker notion of focusing since the exchange rule must then be permissible during both focused and unfocused phases of reduction.

#### **3 A Focused System**

At no point in this section will we refer to bunches, thus the variable Δ, so far reserved for elements of B, is re-appropriated as an alternative to Γ.

#### **3.1 Polarisation**

Polarity in the focusing principle is determined by the invariance of provability under application of a rule, that is, by the proof rules themselves. One way the distinction between positive and negative connectives is apparent is when their rules behave either *synchronously* or *asynchronously*. For example, the ∗<sub>R</sub> and −∗<sub>L</sub> rules highlight the synchronous behaviour of the multiplicative connectives since the structure of the context affects the applicability of the rule. Displaying such a synchronous behaviour on the left makes −∗ a negative connective, while having it on the right makes ∗ a positive connective.

Another way to characterise the polarity of a connective is to study the invertibility properties of the corresponding rules. For example, consider the inverses of the ∨<sub>L</sub> rule,

$$\frac{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\varphi\} \Rightarrow \chi}\ \vee^{\mathsf{inv}}_{\mathsf{L}1} \qquad\qquad \frac{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\psi\} \Rightarrow \chi}\ \vee^{\mathsf{inv}}_{\mathsf{L}2}$$

They are derivable in LBI with cut (below – the left branch being closed using Lemma 9) and therefore admissible in LBI without cut (by Lemma 8).

$$\dfrac{\dfrac{\varphi \Rightarrow \varphi}{\varphi \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}1} \quad \Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\varphi\} \Rightarrow \chi}\ \mathsf{cut} \qquad\qquad \dfrac{\dfrac{\psi \Rightarrow \psi}{\psi \Rightarrow \varphi \vee \psi}\ \vee_{\mathsf{R}2} \quad \Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\psi\} \Rightarrow \chi}\ \mathsf{cut}$$

This means that provability is invariant in general upon application of ∨<sup>L</sup> since it can always be reverted if needed, as follows

$$\dfrac{\dfrac{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\varphi\} \Rightarrow \chi}\ \vee^{\mathsf{inv}}_{\mathsf{L}1} \qquad \dfrac{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}{\Gamma\{\psi\} \Rightarrow \chi}\ \vee^{\mathsf{inv}}_{\mathsf{L}2}}{\Gamma\{\varphi \vee \psi\} \Rightarrow \chi}\ \vee_{\mathsf{L}}$$

Note however that dual connectives do not necessarily have dual behaviours in terms of provability invariance on the left and on the right. For example, consider all the possible rules for ∧, of which some qualify as positive and others as negative.

$$\frac{\Gamma\{\varphi\} \Rightarrow \chi}{\Gamma\{\varphi \land \psi\} \Rightarrow \chi} \land\_{\mathsf{L1}}^{-} \frac{\Gamma\{\psi\} \Rightarrow \chi}{\Gamma\{\varphi \land \psi\} \Rightarrow \chi} \land\_{\mathsf{L2}}^{-} \qquad \qquad \frac{\Gamma \Rightarrow \varphi \quad \Gamma \Rightarrow \psi}{\Gamma \Rightarrow \varphi \land \psi} \land\_{\mathsf{R}}^{-}$$

$$\frac{\Gamma\{\{\varphi, \psi\}_{+}\} \Rightarrow \chi}{\Gamma\{\varphi \wedge \psi\} \Rightarrow \chi}\ \wedge_{\mathsf{L}}^{+} \qquad\qquad \frac{\Gamma \Rightarrow \varphi \quad \Delta \Rightarrow \psi}{\{\Gamma, \Delta\}_{+} \Rightarrow \varphi \wedge \psi}\ \wedge_{\mathsf{R}}^{+}$$

All of these rules are sound, and replacing the conjunction rules in LBI with any pair of a left and a right rule will result in a sound and complete system. Indeed, the rules are inter-derivable when the structural rules are present, but otherwise they can be paired to form two sets of rules which have essentially different proof-search behaviours. That is, the rules in the top row make ∧ negative while those in the bottom row make ∧ positive. Each conjunction also comes with an associated unit: ⊤⁻ for the negative conjunction and ⊤⁺ for the positive conjunction. We choose to add all of them to our system in order to have access to those different proof-search behaviours at will.

Finally, the polarity of a propositional letter can be assigned arbitrarily, as long as each letter is assigned exactly one polarity.

**Definition 20 (Polarised Syntax).** *Let* P⁺ ⊎ P⁻ *be a partition of* P*, and let* A⁺ ∈ P⁺ *and* A⁻ ∈ P⁻*, then the polarised formulas are defined by the following grammar,*

$$\begin{aligned} P, Q &::= L \mid P \vee Q \mid P \ast Q \mid P \wedge^{+} Q \mid \top^{+} \mid \top^{*} \mid \bot & \qquad L &::= \downarrow N \mid A^{+} \\ N, M &::= R \mid P \to N \mid P \mathbin{-\!\ast} N \mid N \wedge^{-} M \mid \top^{-} & \qquad R &::= \uparrow P \mid A^{-} \end{aligned}$$

*The set of positive formulas* P *is denoted* F⁺*; the set of negative formulas* N *is denoted* F⁻*; and the set of all polarised formulas is denoted* F±*. The sub-classifications* L *and* R *are left-neutral and right-neutral formulas respectively.*

The shift operators have no logical meaning; they simply mediate the exchange of polarity, and thus the *shifting* into a new phase of proof-search. Consequently, to reduce the number of cases in subsequent proofs, we will consider formulas of the form ↑↓N and ↓↑P, but not ↓↑↓N, ↓↑↓↑P, etc.

**Definition 21 (Depolarisation).** *Let* ◦ ∈ {∨, ∗, →, −∗}*, and let* A⁺ ∈ P⁺ *and* A⁻ ∈ P⁻*, then the depolarisation function* ⌊·⌋ : F± → F *is defined as follows:*

$$\begin{array}{lll} \lfloor \top^{+} \rfloor := \lfloor \top^{-} \rfloor := \top & \lfloor \bot \rfloor := \bot & \lfloor \top^{*} \rfloor := \top^{*}\\ \lfloor A^{+} \rfloor := \lfloor A^{-} \rfloor := A & \lfloor \uparrow\varphi \rfloor := \lfloor \downarrow\varphi \rfloor := \lfloor \varphi \rfloor & \\ \lfloor \varphi \circ \psi \rfloor := \lfloor \varphi \rfloor \circ \lfloor \psi \rfloor & \lfloor \varphi \wedge^{+} \psi \rfloor := \lfloor \varphi \wedge^{-} \psi \rfloor := \lfloor \varphi \rfloor \wedge \lfloor \psi \rfloor & \end{array}$$
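For example, choosing A, B, C ∈ P⁺ (an arbitrary choice of polarities, made only for illustration), one polarisation of the BI formula (A ∗ B) → C is the negative formula (A⁺ ∗ B⁺) → ↑C⁺, and the clauses above give ⌊(A⁺ ∗ B⁺) → ↑C⁺⌋ = (A ∗ B) → C.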

Since proof-search is controlled by polarity, the construction of sequents in the focused system must be handled carefully to avoid ambiguity.

**Definition 22 (Polarised Sequents).** Positive *and* neutral *nests, denoted by* Γ *and* −→Γ *resp., are defined according to the following grammars*

$$\begin{array}{llll} \Gamma := \Sigma \mid \Pi & \quad \Sigma := P \mid \{\Pi_1, \ldots, \Pi_n\}_+ & \quad \Pi := P \mid \{\Sigma_1, \ldots, \Sigma_n\}_\times \\ \vec{\Gamma} := \vec{\Sigma} \mid \vec{\Pi} & \quad \vec{\Sigma} := L \mid \{\vec{\Pi}_1, \ldots, \vec{\Pi}_n\}_+ & \quad \vec{\Pi} := L \mid \{\vec{\Sigma}_1, \ldots, \vec{\Sigma}_n\}_\times \end{array}$$

*A pair of a polarised nest and a polarised formula is a* polarised sequent *if it falls into one of the following cases*

$$\Gamma \Rightarrow N \quad \mid \quad \overrightarrow{\Gamma} \Rightarrow \langle P \rangle \quad \mid \quad \overrightarrow{\Gamma}\{ \langle N \rangle \} \Rightarrow R$$

The decoration ⟨ϕ⟩ indicates that the formula ϕ is in focus; that is, it is a positive formula on the right, or a negative formula on the left. Of the three possible cases of well-formed polarised sequents, the first may be called *unfocused*, with the particular case of being *neutral* when of the form −→Γ ⇒ R; the latter two may be called *focused*.

**Definition 23 (Depolarised Nest).** *The depolarisation map extends to polarised nests* [·] : B/≡<sup>±</sup> → B/≡ *as follows:*

$$[\{\Pi_1, ..., \Pi_n\}_{+}] = \{[\Pi_1], ..., [\Pi_n]\}_{+} \qquad\qquad [\{\Sigma_1, ..., \Sigma_n\}_{\times}] = \{[\Sigma_1], ..., [\Sigma_n]\}_{\times}$$


**Fig. 3.** System fBI

#### **3.2 Focused Calculus**

We may now give the focused system, that is, the operational semantics for focused proof-search in LBI. All the rules, with the exception of P and N, are polarised versions of the rules from ηLBI.

**Definition 24 (System** fBI**).** *The focused system* fBI *is composed of the rules in Figure 3.*

Note the absence of a cut-rule: this is because the above system is intended to encapsulate precisely *focused* proof-search. Below we show that a cut-rule is indeed admissible, but proofs in fBI+cut are not necessarily focused themselves. Here the distinction between the methodologies for establishing the focusing principle becomes apparent, since one may show completeness without leaving fBI by a permutation argument instead of a cut-elimination one.

The P and N rules will allow us to move a formula from one side to the other during the proof of the completeness of fBI + cut (Lemma 34). The depolarised versions are not directly present in LBI, but are derivable in LBI (Lemma 9). However, the way they are focused renders them unprovable in fBI, because it forces one to begin with a potentially *bad* choice; for example, A ∨ B ⇒ A ∨ B has no proof beginning with ∨R. In practice, they are a feature rather than a bug since they allow one to terminate proof-search early, without unnecessary further expansion of the axiom. In related works, such as [6,5], the analogous rules are eliminated by initially working with a weaker notion of focused proof-search, and it is reasonable to suppose that the same may be true for BI. We leave this to future investigation.

Note also that, although it is perhaps proof-theoretically displeasing to incorporate weakening into the operational rules as in −∗ <sup>L</sup> and ∗ R, it has good computational behaviour during focused proof-search since the reduction of ϕ −∗ ψ can only arise out of an explicit choice made earlier in the computation.

Soundness follows immediately from the depolarisation map; that is, the interpretation of polarised sequents as nested sequents, and hence proofs in fBI actually are focused proofs in ηLBI.

**Theorem 25 (Soundness of** fBI**).** *Let* Γ *be a polarised nest and* N *a negative formula. If* Γ ⇒ N *is provable in* fBI*, then* [Γ] ⇒ [N] *is provable in* <sup>η</sup>LBI*.*

*Proof.* Every rule in fBI, except the shift rules and the P and N axioms, becomes a rule in ηLBI when the antecedent(s) and consequent are depolarised. Instances of the shift rules can be ignored, since the depolarised versions of the consequent and antecedents are the same. Finally, the depolarised versions of P and N follow from Lemma 9 with the use of some weakening.

*Example 26.* Consider the following proof in fBI, where we suppose that the propositional letters A and C are negative, but B is positive.

$$
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{}{A \Rightarrow A}\,\mathsf{Ax}^{-}}{{\downarrow}A \Rightarrow A}\,{\downarrow_{\mathsf{L}}}}{{\downarrow}A \Rightarrow {\downarrow}A}\,{\downarrow_{\mathsf{R}}} \qquad \dfrac{}{B \Rightarrow B}\,\mathsf{Ax}^{+}}{\{{\downarrow}A, B\}_{\times} \Rightarrow {\downarrow}A \ast B}\,{\ast_{\mathsf{R}}}}{\{{\downarrow}A, B\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast B)}\,{\uparrow_{\mathsf{R}}}}{\{{\downarrow}A, {\uparrow}B\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast B)}\,{\uparrow_{\mathsf{L}}}}{\{{\downarrow}A, {\uparrow}B \wedge^{-} C\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast B)}\,{\wedge^{-}_{\mathsf{L}1}}}{\{{\downarrow}A, {\downarrow}({\uparrow}B \wedge^{-} C)\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast B)}\,{\downarrow_{\mathsf{L}}} \tag{1}
$$

$$
\dfrac{\dfrac{\dfrac{\dfrac{\dfrac{}{A \Rightarrow A}\,\mathsf{Ax}^{-}}{{\downarrow}A \Rightarrow A}\,{\downarrow_{\mathsf{L}}}}{{\downarrow}A \Rightarrow {\downarrow}A}\,{\downarrow_{\mathsf{R}}} \qquad \dfrac{\dfrac{\dfrac{\dfrac{}{C \Rightarrow C}\,\mathsf{Ax}^{-}}{{\uparrow}B \wedge^{-} C \Rightarrow C}\,{\wedge^{-}_{\mathsf{L}2}}}{{\downarrow}({\uparrow}B \wedge^{-} C) \Rightarrow C}\,{\downarrow_{\mathsf{L}}}}{{\downarrow}({\uparrow}B \wedge^{-} C) \Rightarrow {\downarrow}C}\,{\downarrow_{\mathsf{R}}}}{\{{\downarrow}A, {\downarrow}({\uparrow}B \wedge^{-} C)\}_{\times} \Rightarrow {\downarrow}A \ast {\downarrow}C}\,{\ast_{\mathsf{R}}}}{\{{\downarrow}A, {\downarrow}({\uparrow}B \wedge^{-} C)\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast {\downarrow}C)}\,{\uparrow_{\mathsf{R}}} \tag{2}
$$

The sub-derivations (1) and (2) are then combined as follows:

$$
\dfrac{\dfrac{\dfrac{(1) \qquad (2)}{\{{\downarrow}A, {\downarrow}({\uparrow}B \wedge^{-} C)\}_{\times} \Rightarrow {\uparrow}({\downarrow}A \ast B) \wedge^{-} {\uparrow}({\downarrow}A \ast {\downarrow}C)}\,{\wedge^{-}_{\mathsf{R}}}}{{\downarrow}A \ast {\downarrow}({\uparrow}B \wedge^{-} C) \Rightarrow {\uparrow}({\downarrow}A \ast B) \wedge^{-} {\uparrow}({\downarrow}A \ast {\downarrow}C)}\,{\ast_{\mathsf{L}}}}{\varnothing_{\times} \Rightarrow ({\downarrow}A \ast {\downarrow}({\uparrow}B \wedge^{-} C)) \mathbin{-\!\ast} ({\uparrow}({\downarrow}A \ast B) \wedge^{-} {\uparrow}({\downarrow}A \ast {\downarrow}C))}\,{{-\!\ast}_{\mathsf{R}}}
$$

It is a focused version of the proof given in Example 19. Observe that the only non-deterministic choices are which formula to focus on, such as in steps (1) and (2), where different choices have been made for the sake of demonstration. The point of focusing is that *only* at such points do choices that affect termination occur. The assignment of polarity to the propositional letters is what forced the shape of the proof; for example, if B had been negative, the above would not have been well-formed. This phenomenon is commonly observed in focused systems (e.g. [7]).

We now introduce the tool which will allow us to show that if there is a proof of a sequent (*a priori* unstructured), then there is necessarily a focused one.

**Definition 27.** *All instances of the following rule where the sequents are well-formed are instances of* cut*, where* −→ϕ *denotes that* ϕ *is possibly prenexed with an additional shift:*

$$\frac{\Delta \Rightarrow \varphi \quad \Gamma \{\overrightarrow{\varphi}\} \Rightarrow \chi}{\Gamma \{\Delta\} \Rightarrow \chi} \text{ cut}$$

Admissibility follows from the usual argument, but within the focused system; that is, through the upward permutation of cuts until they are either eliminated at the axioms or reduced with respect to some other measure.

**Definition 28 (Good and Bad Cuts).** *Let* D *be an* fBI + cut *proof. A* cut *is a quadruple* ⟨L, R, C, ϕ⟩ *where* L *and* R *are the premises of a* cut *rule concluding* C *in* D*, and* ϕ *is the* cut*-formula. Cuts are classified as follows:*

**–** *Good: if* ϕ *is principal in both* L *and* R*.*

**–** *Bad: if* ϕ *is not principal in one of* L *and* R*; these are of* Type 1 *if* ϕ *is not principal in* L*, and of* Type 2 *if* ϕ *is not principal in* R*.*

**Definition 29 (Cut Ordering).** *The* cut*-rank of a cut* ⟨L, R, C, ϕ⟩ *in a proof is the triple* ⟨cut*-complexity,* cut*-duplicity,* cut*-level*⟩*, where the* cut*-complexity is the size of* ϕ*, the* cut*-duplicity is the number of contraction instances above the cut, and the* cut*-level is the sum of the heights of the sub-proofs concluding* L *and* R*.*

*Let* D *and* D′ *be two* fBI + cut *proofs, and let* σ *and* σ′ *denote their respective multisets of cuts. Proofs are ordered by* D ≺ D′ ⟺ σ < σ′*, where* < *is the multiset ordering derived from the lexicographic ordering on* cut*-ranks.*

It follows from a result in [10] that the ordering on proofs is a well-order, since the ordering on cuts is a well-order.
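
As a small illustration of ours: under this lexicographic ordering, ⟨2, 0, 7⟩ < ⟨2, 1, 3⟩ < ⟨3, 0, 0⟩, so lowering the cut-complexity always lowers the cut-rank, even at the price of increasing the cut-duplicity or the cut-level.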

**Lemma 30 (Good Cuts Elimination).** *Let* D *be an* fBI+cut *proof of* S*; there is an* fBI + cut *proof* D′ *of* S *containing no good cuts such that* D′ ⪯ D*.*

*Proof.* Let D be as in the hypothesis; if it contains no good cuts, then D′ = D gives the desired proof. Otherwise, there is at least one good cut ⟨L, R, C, ϕ⟩. Let ∂ be the sub-proof in D concluding C; then there is a transformation ∂ → ∂′ where ∂′ is an fBI + cut proof of C with ∂′ ≺ ∂ such that the multiset of good cuts in ∂′ is smaller (with respect to <) than the multiset of good cuts in ∂. Since ≺ is a well-order, repeatedly replacing ∂ with ∂′ in D for the various cuts yields the desired D′.

The key step is that a cut of a certain cut-complexity is replaced by cuts of lower cut-complexity, possibly increasing the cut-duplicity or cut-level of other cuts in the proof, but not modifying their complexity.

$$
\frac{\dfrac{}{\{\overrightarrow{\Gamma}', A^{+}\}_{+} \Rightarrow A^{+}}\,\mathsf{Ax}^{+} \qquad \overrightarrow{\Gamma}\{A^{+}\} \Rightarrow A^{+}}{\overrightarrow{\Gamma}\{\{\overrightarrow{\Gamma}', A^{+}\}_{+}\} \Rightarrow A^{+}}\,\mathsf{cut}
\quad\rightsquigarrow\quad
\frac{\overrightarrow{\Gamma}\{A^{+}\} \Rightarrow A^{+}}{\overrightarrow{\Gamma}\{\{\overrightarrow{\Gamma}', A^{+}\}_{+}\} \Rightarrow A^{+}}\,\mathsf{W}
$$

$$
\frac{\dfrac{\{\overrightarrow{\Delta}, P\}_{\times} \Rightarrow N}{\overrightarrow{\Delta} \Rightarrow P \mathbin{-\!\ast} N}\,{-\!\ast}_{\mathsf{R}} \qquad \dfrac{\overrightarrow{\Delta}' \Rightarrow P \qquad \overrightarrow{\Gamma}\{\overrightarrow{\Delta}'', N\}_{\times} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', P \mathbin{-\!\ast} N\}_{+}\}_{\times} \Rightarrow R}\,{-\!\ast}_{\mathsf{L}}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', \overrightarrow{\Delta}\}_{+}\}_{\times} \Rightarrow R}\,\mathsf{cut}
$$

$$
\rightsquigarrow\quad
\frac{\dfrac{\overrightarrow{\Delta}' \Rightarrow P \qquad \dfrac{\{\overrightarrow{\Delta}, P\}_{\times} \Rightarrow N \qquad \overrightarrow{\Gamma}\{\overrightarrow{\Delta}'', N\}_{\times} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}'', \overrightarrow{\Delta}, P\}_{\times} \Rightarrow R}\,\mathsf{cut}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \overrightarrow{\Delta}\}_{\times} \Rightarrow R}\,\mathsf{cut}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', \overrightarrow{\Delta}\}_{+}\}_{\times} \Rightarrow R}\,\mathsf{W}
$$

The inferences labelled W are not actual instances of a weakening rule; they indicate only that weakening is admissible in fBI by construction (Lemma 10).

**Lemma 31 (Bad Cuts Elimination).** *Let* D *be an* fBI + cut *proof of* S *whose only cut is bad; then there is an* fBI + cut *proof* D′ *of* S *such that* D′ ≺ D*.*

*Proof.* Without loss of generality, suppose the cut is the last inference in the proof; it may then be replaced by other cuts whose cut-level or cut-duplicity is smaller, but whose cut-complexity is the same.

First we consider bad cuts where L and R are both axioms. There are no Type 1 bad cuts on axioms, as the formula is always principal there; meanwhile, the Type 2 bad cuts can trivially be permuted upwards or ignored; for example,

$$
\frac{\dfrac{}{\{\overrightarrow{\Delta}, A^{+}\}_{+} \Rightarrow A^{+}}\,\mathsf{Ax}^{+} \qquad \dfrac{\overrightarrow{\Delta}' \Rightarrow P \qquad \overrightarrow{\Gamma}\{\overrightarrow{\Delta}'', N\}_{\times} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', A^{+}, P \mathbin{-\!\ast} N\}_{+}\}_{\times} \Rightarrow R}\,{-\!\ast}_{\mathsf{L}}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', \overrightarrow{\Delta}, A^{+}, P \mathbin{-\!\ast} N\}_{+}\}_{\times} \Rightarrow R}\,\mathsf{cut}
$$

$$
\rightsquigarrow\quad
\frac{\dfrac{\overrightarrow{\Delta}' \Rightarrow P \qquad \overrightarrow{\Gamma}\{\overrightarrow{\Delta}'', N\}_{\times} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', A^{+}, P \mathbin{-\!\ast} N\}_{+}\}_{\times} \Rightarrow R}\,{-\!\ast}_{\mathsf{L}}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}', \overrightarrow{\Delta}'', \{\overrightarrow{\Delta}''', \overrightarrow{\Delta}, A^{+}, P \mathbin{-\!\ast} N\}_{+}\}_{\times} \Rightarrow R}\,\mathsf{W}
$$

Here again we are using an appropriate version of Lemma 10.

For the remaining cases the cuts are commutative in the sense that they may be permuted upward thereby reducing the cut-level. An example is given below.

$$
\frac{\dfrac{\overrightarrow{\Delta}\{\langle N_{1} \rangle\} \Rightarrow M}{\overrightarrow{\Delta}\{\langle N_{1} \wedge^{-} N_{2} \rangle\} \Rightarrow M}\,{\wedge^{-}_{\mathsf{L}1}} \qquad \overrightarrow{\Gamma}\{M\} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{\langle N_{1} \wedge^{-} N_{2} \rangle\}\} \Rightarrow R}\,\mathsf{cut}
\quad\rightsquigarrow\quad
\frac{\dfrac{\overrightarrow{\Delta}\{\langle N_{1} \rangle\} \Rightarrow M \qquad \overrightarrow{\Gamma}\{M\} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{\langle N_{1} \rangle\}\} \Rightarrow R}\,\mathsf{cut}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{\langle N_{1} \wedge^{-} N_{2} \rangle\}\} \Rightarrow R}\,{\wedge^{-}_{\mathsf{L}1}}
$$

The exceptional case is the interaction with contraction where the cut is replaced by cuts of possibly equal cut-level, but cut-duplicity decreases.

$$
\frac{\overrightarrow{\Delta}' \Rightarrow \langle L \rangle \qquad \dfrac{\overrightarrow{\Gamma}\{\{\overrightarrow{\Delta}\{L\}, \overrightarrow{\Delta}\{L\}\}_{+}\} \Rightarrow R}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{L\}\} \Rightarrow R}\,\mathsf{C}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{\overrightarrow{\Delta}'\}\} \Rightarrow R}\,\mathsf{cut}
$$

$$
\rightsquigarrow\quad
\frac{\dfrac{\overrightarrow{\Delta}' \Rightarrow \langle L \rangle \qquad \dfrac{\overrightarrow{\Delta}' \Rightarrow \langle L \rangle \qquad \overrightarrow{\Gamma}\{\{\overrightarrow{\Delta}\{L\}, \overrightarrow{\Delta}\{L\}\}_{+}\} \Rightarrow R}{\overrightarrow{\Gamma}\{\{\overrightarrow{\Delta}\{\overrightarrow{\Delta}'\}, \overrightarrow{\Delta}\{L\}\}_{+}\} \Rightarrow R}\,\mathsf{cut}}{\overrightarrow{\Gamma}\{\{\overrightarrow{\Delta}\{\overrightarrow{\Delta}'\}, \overrightarrow{\Delta}\{\overrightarrow{\Delta}'\}\}_{+}\} \Rightarrow R}\,\mathsf{cut}}{\overrightarrow{\Gamma}\{\overrightarrow{\Delta}\{\overrightarrow{\Delta}'\}\} \Rightarrow R}\,\mathsf{C}
$$

**Theorem 32 (Cut-elimination in** fBI**).** *Let* Γ *be a positive nest and* N *a negative formula. Then* Γ ⇒ N *is provable in* fBI *if and only if it is provable in* fBI + cut*.*

*Proof.* (⇒) Trivial, as any fBI-proof is an fBI + cut-proof. (⇐) Let D be an fBI + cut-proof of Γ ⇒ N; if it has no cuts then it is an fBI-proof, so we are done. Otherwise, there is at least one cut, and we proceed by well-founded induction on the ordering of proofs and sub-proofs of D with respect to ≺.

**Base Case.** Assume D is minimal with respect to ≺ with at least one cut; without loss of generality, by Lemma 30, assume the cut is bad. It follows from Lemma 31 that there is a proof strictly smaller in ≺-ordering, but this proof must be cut-free as D is minimal.

**Inductive Step.** Let D be as in the hypothesis; then by Lemma 30 there is a proof D′ of Γ ⇒ N containing no good cuts such that D′ ⪯ D. Either D′ is cut-free and we are done, or it contains bad cuts. Consider the topmost cut and denote the corresponding sub-proof by ∂; it follows from Lemma 31 that there is a proof ∂′ of the same sequent such that ∂′ ≺ ∂. Hence, by the inductive hypothesis, there is a cut-free proof of that sequent, and replacing ∂ by it in D′ gives a proof of Γ ⇒ N strictly smaller in the ≺-ordering; thus, by the inductive hypothesis, there is a cut-free proof as required.

#### **3.3 Completeness of fBI**

The completeness theorem of the focused system, the operational semantics, is with respect to an interpretation (i.e. a polarisation). Indeed, any polarisation may be considered; for example, both (↓A<sup>−</sup> ∗ B<sup>+</sup>) ∧<sup>+</sup> ↓A<sup>−</sup> and (A<sup>+</sup> ∗ ↓B<sup>−</sup>) ∧<sup>+</sup> A<sup>+</sup> are correct polarised versions of the formula (A ∗ B) ∧ A. For an arbitrary ϕ, the process is as follows: first, fix a polarised syntax (i.e. a partition of the propositional letters into positive and negative sets), then assign a polarity to ϕ with the following steps:


**–** If ϕ is a propositional letter, its polarity is the one given by the chosen partition of P; the units are polarised analogously (⊤ as ⊤<sup>+</sup> or ⊤<sup>−</sup>, while ⊤<sup>∗</sup> and ⊥ are unchanged).

**–** If ϕ = ψ<sup>1</sup> ∧ ψ<sup>2</sup>, then polarise ψ<sup>1</sup> and ψ<sup>2</sup> and combine them with either ∧<sup>+</sup> or ∧<sup>−</sup>, using shifts where necessary.

**–** If ϕ = ψ<sup>1</sup> ◦ ψ<sup>2</sup> where ◦ ∈ {∗, −∗, →, ∨}, then polarise ψ<sup>1</sup> and ψ<sup>2</sup> and combine with ◦ accordingly, using shifts where necessary.

*Example 33.* Suppose A is negative and B is positive; then (A ∗ B) ∧ A may be polarised by choosing the additive conjunction to be positive, resulting in (↓A ∗ B) ∧<sup>+</sup> ↓A (whereas ↓(A ∗ ↓B) ∧<sup>+</sup> A would not be well-formed). By adding a shift, one can instead ascribe a negative polarisation ↑((↓A ∗ B) ∧<sup>+</sup> ↓A).

The above generates the set of all such polarised formulas when all possible choices are explored. The free assignment of polarity to formulas means several distinct focusing procedures are captured by the completeness theorem.

**Lemma 34 (Completeness of** fBI+cut**).** *For any unfocused sequent* Γ ⇒ N*, if* [Γ ⇒ N] *is provable in* <sup>η</sup>LBI*, then* Γ ⇒ N *is provable in* fBI+cut*.*

*Proof.* We show that every rule in ηLBI is derivable in fBI + cut; consequently every proof in ηLBI may be simulated, and hence every provable sequent has a focused proof. For the unfocused rules →<sub>R</sub>, −∗<sub>R</sub>, ∧<sup>−</sup><sub>R</sub>, ∧<sup>+</sup><sub>L</sub>, ∨<sub>L</sub>, ∗<sub>L</sub>, ⊥<sub>L</sub>, ⊤<sup>−</sup><sub>R</sub>, ⊤<sup>+</sup><sub>L</sub>, ⊤<sup>∗</sup><sub>L</sub> this is immediate, as it is for Ax and C. Below we give an example of how to simulate a focused rule.

Where it does not matter (e.g. in the case of inactive nests), we do not distinguish the polarised and unpolarised versions; each of the simulations can be closed thanks to the presence of the P and N rules in fBI.

The rule

$$\frac{\Gamma \Rightarrow \varphi \qquad \Delta \Rightarrow \psi}{\{\{\Gamma,\Delta\}_{\times}, \Delta'\}_{+} \Rightarrow \varphi \ast \psi}\,{\ast_{\mathsf{R}}}$$

in <sup>η</sup>LBI is simulated in fBI + cut by

$$
\dfrac{\dfrac{\dfrac{\Gamma \Rightarrow {\uparrow}\varphi^{+}}{\Gamma \Rightarrow {\downarrow\uparrow}\varphi^{+}}\,{\downarrow_{\mathsf{R}}} \qquad \dfrac{\Delta \Rightarrow {\uparrow}\psi^{+}}{\Delta \Rightarrow {\downarrow\uparrow}\psi^{+}}\,{\downarrow_{\mathsf{R}}}}{\{\Gamma,\Delta\}_{\times} \Rightarrow {\downarrow\uparrow}\varphi^{+} \ast {\downarrow\uparrow}\psi^{+}}\,{\ast_{\mathsf{R}}}
\qquad
\dfrac{\dfrac{\dfrac{\dfrac{}{{\downarrow\uparrow}\varphi^{+} \Rightarrow \varphi^{+}}\,\mathsf{P} \qquad \dfrac{}{{\downarrow\uparrow}\psi^{+} \Rightarrow \psi^{+}}\,\mathsf{P}}{\{\{{\downarrow\uparrow}\varphi^{+}, {\downarrow\uparrow}\psi^{+}\}_{\times}, \Delta'\}_{+} \Rightarrow \varphi^{+} \ast \psi^{+}}\,{\ast_{\mathsf{R}}}}{\{\{{\downarrow\uparrow}\varphi^{+}, {\downarrow\uparrow}\psi^{+}\}_{\times}, \Delta'\}_{+} \Rightarrow {\uparrow}(\varphi^{+} \ast \psi^{+})}\,{\uparrow_{\mathsf{R}}}}{\{{\downarrow\uparrow}\varphi^{+} \ast {\downarrow\uparrow}\psi^{+}, \Delta'\}_{+} \Rightarrow {\uparrow}(\varphi^{+} \ast \psi^{+})}\,{\ast_{\mathsf{L}}}
$$
$$
\dfrac{\qquad\qquad\qquad\qquad\qquad\qquad}{\{\{\Gamma,\Delta\}_{\times}, \Delta'\}_{+} \Rightarrow {\uparrow}(\varphi^{+} \ast \psi^{+})}\,\mathsf{cut}
$$

where the cut is applied to the conclusions of the two derivations displayed above.

**Theorem 35 (Completeness of** fBI**).** *For any unfocused* Γ ⇒ N*, if* [Γ ⇒ N] *is provable in* <sup>η</sup>LBI*, then* Γ ⇒ N *is provable in* fBI*.*

*Proof.* It follows from Lemma 34 that there is a proof of Γ ⇒ N in fBI + cut, and then it follows from Theorem 32 that there is a proof of Γ ⇒ N in fBI.

Given an arbitrary sequent the above theorem guarantees the existence of a focused proof, thus the focusing principle holds for ηLBI and therefore for LBI.

#### **4 Conclusion**

By proving the completeness of a focused sequent calculus for the logic of Bunched Implications, we have demonstrated that it satisfies the focusing principle; that is, any polarisation of a BI-provable sequent can be proved following a focused search procedure. This required a careful analysis of how to restrict the usage of structural rules. In particular, we had to fully develop the congruence-invariant representation of bunches as nested multisets (originally proposed in [12]) to treat the exchange rule within bunched structures.

Proof-theoretically, the completeness of the focused system suggests a syntactic orderliness of LBI, though the P and N rules leave something to be desired. Computationally, these axioms are unproblematic, as during search it makes sense to terminate a branch as soon as possible; however, unless they can be eliminated, the focusing principle holds in BI only up to a point. In related works (cf. [6]) the analogous problem is overcome by first considering a *weak* focused system; that is, one where the structural rules are not controlled and unfocused rules may be performed inside focused phases if desired. Completeness of (strong) focusing is then achieved by appealing to a *synthetic* system. It seems reasonable to suppose the same can be done for BI, resulting in a more proof-theoretically satisfactory focused calculus; exploring this possibility is a natural extension of the work on fBI.

The methodology employed for proving the focusing principle can be interpreted as soundness and completeness of an operational semantics for goal-directed search. The robustness of this technique is demonstrated by its efficacy in modal [6,5] and substructural logics [26], including now bunched ones. Although BI may be the most widely used bunched logic, there are a number of others, such as the family of relevant logics [36] and the family of bunched logics in [11], for which the focusing principle should be studied. However, without a cut-free sequent calculus, goal-directed search becomes unclear, and currently such calculi do not exist for the two main variants of BI: Boolean BI [33] and Classical BI [4]. On the other hand, large families of bunched and substructural logics have been given hypersequent calculi [8,9]. Effective proof-search procedures have been established for the hypersequent calculi in the substructural case [35], but not in the bunched one, and focused proof-search for neither. There is a technical challenge in focusing these systems, as one must decide not only which formula to reduce, but also which sequent.

In the future it will be especially interesting to see how focused search, when combined with the expressiveness of BI, increases its modelling capabilities. Indeed, the dynamics of proof-search can be used to represent models of computation within (propositional) logics; for example, the undecidability proof for Linear Logic involves simulating two-counter machines [26]. One particularly interesting direction is to see how focused proof-search in BI may prove valuable within the context of Separation Logic. Focused systems in particular have been used to emulate proofs for other logics [27], and to give structural operational semantics for systems used in industry, such as algorithms for solving constraint satisfaction problems [14]. A more immediate possibility, though, is the formulation of a theorem prover; we leave a specific implementation and benchmarks to future research.

### **References**



# **Interpolation and Amalgamation for Arrays with MaxDiff**

Silvio Ghilardi<sup>1</sup>, Alessandro Gianola<sup>2(✉)</sup>, and Deepak Kapur<sup>3</sup>

<sup>1</sup> Dipartimento di Matematica, Università degli Studi di Milano, Milano, Italy

<sup>2</sup> Faculty of Computer Science, Free University of Bozen-Bolzano, Bolzano, Italy gianola@inf.unibz.it

<sup>3</sup> Department of Computer Science, University of New Mexico, Albuquerque, USA

**Abstract.** In this paper, the theory of McCarthy's extensional arrays enriched with a maxdiff operation (this operation returns the biggest index where two given arrays differ) is proposed. It is known from the literature that a diff operation is required for the theory of arrays in order to enjoy the Craig interpolation property at the quantifier-free level. However, the diff operation introduced in the literature is merely instrumental to this purpose and has only a purely formal meaning (it is obtained from the Skolemization of the extensionality axiom). Our maxdiff operation significantly increases the level of expressivity; however, obtaining interpolation results for the resulting theory becomes a surprisingly hard task. We obtain such results via a thorough semantic analysis of the models of the theory and of their amalgamation properties. The results are modular with respect to the index theory and it is shown how to convert them into concrete interpolation algorithms via a hierarchical approach.

**Keywords:** Interpolation · Arrays · Amalgamation · SMT

### **1 Introduction**

Since McMillan's seminal papers [31,32], interpolation has been successfully applied in software model checking, also in combination with orthogonal techniques like PDR [38] or k-induction [29]. Interpolation techniques are attractive because they allow one to discover, in a completely *automatic* way, new atoms (often improperly called 'predicates') that might contribute to the construction of invariants. In fact, software model-checking problems are typically infinite-state, so invariant synthesis may require introducing formulae whose search is not finitely bounded. One way to discover them is to analyze spurious error traces; for instance, if the system under examination (described by a transition formula Tr(x, x′)) cannot reach in n steps an error configuration in U(x) starting from an initial configuration in In(x), this means that the formula

$$In(\underline{x}_0) \wedge Tr(\underline{x}_0, \underline{x}_1) \wedge \cdots \wedge Tr(\underline{x}_{n-1}, \underline{x}_n) \wedge U(\underline{x}_n)$$

 The third author has been partially supported by the National Science Foundation CCF award 1908804.


is inconsistent (modulo a suitable theory T). From the inconsistency proof, by computing an interpolant, say at the i-th iteration, one can produce a formula φ(x) such that, modulo T, we have

$$\operatorname{In}(\underline{x}_0) \land \bigwedge_{j=1}^{i} \operatorname{Tr}(\underline{x}_{j-1}, \underline{x}_j) \vdash \phi(\underline{x}_i) \ \text{ and } \ \phi(\underline{x}_i) \land \bigwedge_{j=i+1}^{n} \operatorname{Tr}(\underline{x}_{j-1}, \underline{x}_j) \land U(\underline{x}_n) \vdash \bot \tag{1}$$

This formula (and the atoms it contains) can contribute to the refinement of the current candidate loop invariant guaranteeing safety. This fact can be exploited in very different ways during invariant search, depending on the various techniques employed. It should be noticed, however, that interpolants are not unique and that different interpolation algorithms may return interpolants of different quality: all interpolants restrict the search, but not all of them might be conclusive.
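
To make the role of the unrolled formula (1) concrete, the following sketch (our own illustration, not part of the paper) builds such a trace for a toy one-counter system and checks its inconsistency with the z3 SMT solver; the system, the variable names and the choice of solver are assumptions of this example, and the extraction of the interpolant φ(x_i) itself, which requires an interpolating solver, is not shown.

```python
# A minimal sketch (assumptions: toy counter system, z3 as the SMT backend).
# In(x) := x = 0, Tr(x, x') := x' = x + 1, U(x) := x < 0.
# The n-step unrolling In ∧ Tr ∧ ... ∧ Tr ∧ U is unsatisfiable; an
# interpolating solver cutting the conjunction at step i could return a
# quantifier-free formula such as x_i >= 0.
from z3 import Int, And, Solver, unsat

def unrolled_trace(n):
    xs = [Int(f"x_{j}") for j in range(n + 1)]
    init = xs[0] == 0                                    # In(x_0)
    trans = [xs[j + 1] == xs[j] + 1 for j in range(n)]   # Tr(x_j, x_{j+1})
    bad = xs[n] < 0                                      # U(x_n)
    return And(init, *trans, bad)

s = Solver()
s.add(unrolled_trace(3))
assert s.check() == unsat  # the error trace is spurious
```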

This new application of interpolation is different from the role interpolants have played in the analysis of proof theories of various logics, starting with the pioneering works of [15,24,34]. It should be said, however, that Craig's interpolation theorem in first-order logic does not by itself give any information on the shape the interpolant can have when a specific theory is involved. Nevertheless, this is crucial for the applications: when we extract an interpolant from a trace like (1), we are typically handling a theory which might be undecidable, but whose quantifier-free fragment is decidable for satisfiability (usually within a somewhat 'reasonable' computational complexity). Thus, it is desirable (although not always possible) that the interpolant is quantifier-free, a fact which is not guaranteed in the general case. This is why a lot of effort has been devoted to analyzing *quantifier-free* interpolation, also exploiting its connection to semantic properties like amalgamation and strong amalgamation (see [9] for comprehensive results in the area).

The specific theories we want to analyze in this paper are variants of *McCarthy's theory of arrays* [30] *with extensionality* (see Section 3 below for a detailed description). The main operations considered in this theory are the *write* operation (i.e., the array update) and the *read* operation (i.e., the access to the content of an array cell). As such, this theory is suitable to formalize programs over arrays, such as standard copying, comparing, searching and sorting functions; verification problems of this kind are collected in the SV-COMP benchmark category "ReachSafety-Arrays"<sup>4</sup>, where safety verification tasks involving arrays of *finite but unknown length* are considered.

By itself, the theory of arrays with extensionality does not have quantifier-free interpolation [28]<sup>5</sup>; however, in [8] it was shown that quantifier-free interpolation is restored if one enriches the language with a binary function Skolemizing the extensionality axiom (the result was confirmed - via different interpolation algorithms - in [23,37]). Such a Skolem function, applied to two array variables

<sup>4</sup> https://sv-comp.sosy-lab.org/2020/benchmarks.php

<sup>5</sup> This is the counterexample (due to R. Jhala): the formula x = wr(y, i, e) is inconsistent with the formula rd(x, j) ≠ rd(y, j) ∧ rd(x, k) ≠ rd(y, k) ∧ j ≠ k, but all possible interpolants require quantifiers to be written (with diff symbols, instead, it is possible to write down an interpolant without quantifiers, as shown in [8]).

a, b, returns an index diff(a, b) where a, b differ (it returns an arbitrary value if a is equal to b). This semantics for the diff operation is largely undetermined and does not have a significant interpretation in concrete programs. That is why we propose to modify it in order to give it a defined and natural meaning: we ask diff(a, b) to return the *biggest* index where a, b differ (in case a = b we ask diff(a, b) to be the minimum index 0). Since it is natural to view arrays as functions defined on initial intervals of the nonnegative integers, this choice has a clear semantic motivation. The expressive power of the theory of arrays so enriched becomes bigger: for instance, if we also add to the language a constant symbol ε for the undefined array, constantly equal to some 'undefined' value ⊥ (where ⊥ is meant to be different from the values a[i] actually in use), then we can define |a| as diff(a, ε). In this way we can model the fact that a is undefined outside the interval [0, |a|] - this is useful to formalize the above-mentioned SV-COMP benchmarks.

The effectiveness of quantifier-free interpolation in the theory of arrays with maxdiff is exemplified in the simple example of Figure 1: the invariant certifying the assert in line 7 of the Strcpy algorithm can be obtained by taking a suitable quantifier-free interpolant out of the spurious trace (1) already for n = 2. In more realistic examples, as witnessed by current research [2,3,4,5,16,22,25,13], it is quite clear that useful invariants require universal quantifiers to be expressed, and if undecidable fragments are entered, incomplete solvers must be used. However, even in such circumstances, quantifier-free interpolation does not lose its interest: for instance, the tool Booster [5]<sup>6</sup> synthesizes universally quantified invariants out of quantifier-free interpolants (quantifier-free interpolation problems are generated by negating and Skolemizing universally quantified formulae arising during invariant search, see [4] for details).

```
1  int a[N];
2  int b[N];
3  int I = 0;
4  while I < N do
5      b[I] = a[I];
6      I++;
7  assert(a = b);
```

- In(a, b, I) ≡ I = 0 ∧ |a| = N − 1 ∧ |b| = N − 1 ∧ N > 0
- Tr(a, b, I, a′, b′, I′) ≡ I < N ∧ I′ = I + 1 ∧ a′ = a ∧ b′ = wr(b, I, rd(a, I))
- U(a, b, I) ≡ a ≠ b ∧ I = N

**Fig. 1.** Strcpy function: code and associated transition system (with the program counter omitted in the latter for simplicity). Loop invariant: a = b ∨ (N > diff(a, b) ∧ diff(a, b) ≥ I).

Proving that the theory of arrays with the above 'maxdiff' operation enjoys quantifier-free interpolation turned out to be a surprisingly difficult task. In

<sup>6</sup> Booster is no longer maintained, however it is still referred to in current experimental evaluations [16,13].

the end, the interpolation algorithm we obtain resembles the interpolation algorithms generated via the hierarchic locality techniques introduced in [35,36] and employed also in [37]; however, its correctness, completeness and termination proofs require a large détour through non-trivial model-theoretic arguments (these arguments would not be substantially simplified by adopting the complex framework of 'amalgamation closures' and 'W-separability' of [37], and that is the reason why we preferred to supply direct proofs).

This paper concentrates on theoretical and methodological results, rather than on experimental aspects. It is almost completely dedicated to the correctness and completeness proof of our interpolation algorithm: in Subsection 3.1 we summarize our proof plan and supply basic intuitions. The paper is structured as follows: in Section 2 we recall some background, in Section 3 we introduce our theory of arrays with maxdiff; Sections 4 and 5 supply the semantic proof of the amalgamation theorem; Sections 6 and 7 are dedicated to the algorithmic aspects, whereas Section 8 analyzes complexity for the restricted case where indexes are constrained by the theory of total orders. In the final Section 9, we mention some still open problems. The main results in the paper are Theorems 2, 4, and 5: for space reasons, *all proofs of these theorems will be only sketched*; full details are nevertheless supplied in the extended version available online [21]. This extended version contains additional material on complexity analysis and implementation. It also contains a proof about the nonexistence of uniform interpolants (see [26,27,20,10,11,12] for the definition and more information on uniform interpolants).

#### **2 Formal Preliminaries**

We assume the usual syntactic (e.g., signature, variable, term, atom, literal, formula, and sentence) and semantic (e.g., structure, sub-structure, truth, satisfiability, and validity) notions of (possibly many-sorted) first-order logic. The equality symbol "=" is included in all signatures considered below. Notations like E(x) mean that the expression (term, literal, formula, etc.) E contains free variables only from the tuple x. A 'tuple of variables' is a list of variables without repetitions and a 'tuple of terms' is a list of terms (possibly with repetitions). Finally, whenever we use a notation like E(x, y) we implicitly assume not only that both the x and the y are pairwise distinct, but also that x and y are disjoint. A *constraint* is a conjunction of literals. A formula is *universal* (*existential*) iff it is obtained from a quantifier-free formula by prefixing it with a string of universal (existential, resp.) quantifiers.

*Theories and satisfiability modulo theory.* A *theory* T is a pair (Σ, Ax<sub>T</sub>), where Σ is a signature and Ax<sub>T</sub> is a set of Σ-sentences, called the *axioms* of T (we shall sometimes write directly T for Ax<sub>T</sub>). The *models* of T are those Σ-structures in which all the sentences in Ax<sub>T</sub> are true. A Σ-formula φ is T*-satisfiable* (or T-consistent) if there exists a model M of T such that φ is true in M under a suitable assignment a to the free variables of φ (in symbols, (M, a) |= φ); it is T*-valid* (in symbols, ⊢<sub>T</sub> φ) if its negation is T-unsatisfiable or, equivalently, if φ is provable from the axioms of T in a complete calculus for first-order logic. A theory T = (Σ, Ax<sub>T</sub>) is *universal* iff all sentences in Ax<sub>T</sub> are universal. A formula φ<sub>1</sub> T*-entails* a formula φ<sub>2</sub> if φ<sub>1</sub> → φ<sub>2</sub> is T-valid (in symbols, φ<sub>1</sub> ⊢<sub>T</sub> φ<sub>2</sub>, or simply φ<sub>1</sub> ⊢ φ<sub>2</sub> when T is clear from the context). If Γ is a set of formulæ and φ a formula, Γ ⊢<sub>T</sub> φ means that there are γ<sub>1</sub>, ..., γ<sub>n</sub> ∈ Γ such that γ<sub>1</sub> ∧ ··· ∧ γ<sub>n</sub> ⊢<sub>T</sub> φ. The *satisfiability modulo the theory* T (SMT(T)) *problem* amounts to establishing the T-satisfiability of quantifier-free Σ-formulæ (equivalently, the T-satisfiability of Σ-constraints). A theory T admits *quantifier elimination* iff for every formula φ(x) there is a quantifier-free formula φ′(x) such that ⊢<sub>T</sub> φ ↔ φ′.

Some theories have special names, which are becoming standard in SMTliterature; for instance, EUF(Σ) is the pure equality theory in the signature Σ (this is commonly abbreviated as EUF if there is no need to specify the signature Σ). More standard theory names will be recalled during the paper.

*Embeddings and sub-structures.* The support of a structure M is denoted with |M|. For a (sort, function, relation) symbol σ, we denote as σ<sup>M</sup> the interpretation of σ in M. An embedding is a homomorphism that preserves and reflects relations and operations (see, e.g., [14]). Formally, a Σ*-embedding* (or, simply, an embedding) between two Σ-structures M and N is any mapping μ : |M| −→ |N| satisfying the following three conditions: (a) it is a (sort-preserving) injective function; (b) it is an algebraic homomorphism, that is, for every n-ary function symbol f and for every a<sub>1</sub>, ..., a<sub>n</sub> ∈ |M|, we have f<sup>N</sup>(μ(a<sub>1</sub>), ..., μ(a<sub>n</sub>)) = μ(f<sup>M</sup>(a<sub>1</sub>, ..., a<sub>n</sub>)); (c) it preserves and reflects predicates, i.e. for every n-ary predicate symbol P, we have (a<sub>1</sub>, ..., a<sub>n</sub>) ∈ P<sup>M</sup> iff (μ(a<sub>1</sub>), ..., μ(a<sub>n</sub>)) ∈ P<sup>N</sup>. If |M| ⊆ |N| and the embedding μ : M −→ N is just the identity inclusion |M| ⊆ |N|, we say that M is a *substructure* of N or that N is a *superstructure* of M. As is known, the truth of a universal (resp. existential) sentence is preserved through substructures (resp. superstructures).

*Combinations of theories.* A theory T is *stably infinite* iff every T-satisfiable quantifier-free formula (from the signature of T) is satisfiable in an infinite model of T. By compactness, it is possible to show that T is stably infinite iff every model of T embeds into an infinite one (see, e.g., [17]). A theory T is *convex* iff for every conjunction of literals δ, if δ ⊢<sub>T</sub> ⋁<sub>i=1</sub><sup>n</sup> x<sub>i</sub> = y<sub>i</sub> then δ ⊢<sub>T</sub> x<sub>i</sub> = y<sub>i</sub> holds for some i ∈ {1, ..., n}. Let T<sub>i</sub> be a stably-infinite theory over the signature Σ<sub>i</sub> such that the SMT(T<sub>i</sub>) problem is decidable for i = 1, 2 and such that Σ<sub>1</sub> and Σ<sub>2</sub> are disjoint (i.e. the only shared symbol is equality). Under these assumptions, the *Nelson-Oppen combination result* [33] says that the SMT problem for the combination T<sub>1</sub> ∪ T<sub>2</sub> of the theories T<sub>1</sub> and T<sub>2</sub> is decidable.
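
As a small hedged illustration of the combination result (our own example, not taken from the paper), consider the disjoint combination of linear integer arithmetic and EUF with a unary free function f: the constraint x ≤ y ∧ y ≤ x ∧ f(x) ≠ f(y) is unsatisfiable in the union, because the arithmetic component entails the equality x = y between the shared variables, and propagating this equality to the EUF component contradicts f(x) ≠ f(y).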

*Interpolation properties.* Craig's interpolation theorem [14] roughly states that if a formula φ implies a formula ψ then there is a third formula θ, called an interpolant, such that φ implies θ, θ implies ψ, and every non-logical symbol in θ occurs both in φ and ψ. Our interest is to specialize this result to the computation of quantifier-free interpolants modulo (combinations of) theories.

**Definition 1.** *[Plain quantifier-free interpolation] A theory* T admits (plain) quantifier-free interpolation *(or, equivalently,* has quantifier-free interpolants*) iff for every pair of quantifier-free formulae* φ, ψ *such that* ψ ∧ φ *is* T*-unsatisfiable, there exists a quantifier-free formula* θ*, called an* interpolant*, such that: (i)* ψ T*-entails* θ*, (ii)* θ ∧φ *is* T*-unsatisfiable, and (iii) only the variables occurring in both* ψ *and* φ *occur in* θ*.*
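
For instance (a small illustration of ours in pure EUF): take ψ := f(x) = a ∧ f(y) = b ∧ a ≠ b and φ := x = y. Then ψ ∧ φ is EUF-unsatisfiable, and θ := x ≠ y is an interpolant: ψ EUF-entails θ, θ ∧ φ is EUF-unsatisfiable, and x, y are exactly the variables occurring in both ψ and φ.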

In verification, the following extension of Definition 1 is considered more useful.

**Definition 2.** *[General quantifier-free interpolation] Let* T *be a theory in a signature* Σ*; we say that* T *has the* general quantifier-free interpolation property *iff for every signature* Σ′ *(disjoint from* Σ*) and for every pair of ground* Σ ∪ Σ′ *formulæ* φ, ψ *such that* φ ∧ ψ *is* T*-unsatisfiable*<sup>7</sup>*, there is a ground formula* θ *such that: (i)* φ T*-entails* θ*; (ii)* θ ∧ ψ *is* T*-unsatisfiable; (iii) all relations, constants and function symbols from* Σ′ *occurring in* θ *also occur in* φ *and* ψ*.*

By replacing free variables with free constants, it should be clear that general quantifier-free interpolation (Definition 2) implies plain quantifier-free interpolation (Definition 1); however, the converse implication does not hold.

*Amalgamation and strong amalgamation.* Interpolation can be characterized semantically via amalgamation.

**Definition 3.** *A universal theory* T *has the* amalgamation property *iff given models* M<sup>1</sup> *and* M<sup>2</sup> *of* T *and a common submodel* A *of them, there exists a further model* M *of* T *(called* T*-amalgam) endowed with embeddings* μ<sup>1</sup> : M<sup>1</sup> −→ M *and* μ<sup>2</sup> : M<sup>2</sup> −→ M *whose restrictions to* |A| *coincide.*

*A universal theory* T *has the* strong amalgamation property *if the above embeddings* μ1, μ<sup>2</sup> *and the above model* M *can be chosen so to satisfy the following additional condition: if, for some* m<sup>1</sup> ∈ |M1|, m<sup>2</sup> ∈ |M2|*,* μ1(m1) = μ2(m2) *holds, then there exists an element* a *in* |A| *such that* m<sup>1</sup> = a = m2*.*

The first statement of the following theorem is an old result due to [6]; the second statement is proved in [9] (where it is also suitably reformulated for theories which are not universal):

#### **Theorem 1.** *Let* T *be a universal theory. Then*

1. T *has the amalgamation property iff it admits (plain) quantifier-free interpolation (Definition 1);*

2. T *has the strong amalgamation property iff it admits general quantifier-free interpolation (Definition 2).*


We underline that, in the presence of stable infiniteness, strong amalgamation is a modular property (in the sense that it transfers to signature-disjoint unions of theories), whereas amalgamation is not (see again [9] for details).

<sup>7</sup> By this (and similar notions) we mean that φ ∧ ψ is unsatisfiable in all Σ ∪ Σ′-structures whose Σ-reduct is a model of T.

### **3 Arrays with MaxDiff**

The *McCarthy theory of arrays* [30] has three sorts ARRAY, ELEM, INDEX (called "array", "element", and "index" sort, respectively) and two function symbols rd ("read") and wr ("write") of appropriate arities; its axioms are:

$$\begin{aligned} \forall y, i, e. \; rd(wr(y, i, e), i) &= e \\ \forall y, i, j, e. \; i \neq j &\to rd(wr(y, i, e), j) &= rd(y, j). \end{aligned}$$

The McCarthy theory of *arrays with extensionality* has the further axiom

$$\forall x, y. x \neq y \to (\exists i. \; rd(x, i) \neq rd(y, i)), \tag{2}$$

called the 'extensionality' axiom. The theory of arrays with extensionality is not universal, and quantifier-free interpolation fails for it [28]. In [8] a variant of the McCarthy theory of arrays with extensionality, obtained by Skolemizing the axiom of extensionality, is introduced. This variant of the theory turns out to be universal and to enjoy quantifier-free interpolation. However, the Skolem function introduced in [8] is generic; here we want to make it more informative, so that it returns the biggest index at which two different arrays differ. To locate our contribution in the general context, we need the notion of an index theory.

**Definition 4.** *An* index theory T<sup>I</sup> *is a mono-sorted theory (let* INDEX *be its sort) satisfying the following conditions:*

**–** T<sup>I</sup> *extends the theory* TO *of total orders with the constant* 0 *(recalled below);*

**–** T<sup>I</sup> *is universal and strongly amalgamable (in the sense of Definition 3);*

**–** *the* SMT(T<sup>I</sup>) *problem is decidable.*


We recall that TO is the theory whose only proper symbols (beside equality) are a binary predicate ≤ and a constant 0, subject to the axioms saying that ≤ is reflexive, transitive, antisymmetric and total (the latter means that i ≤ j ∨ j ≤ i holds for all i, j). Thus, the signature of an index theory T<sup>I</sup> contains at least the binary relation symbol ≤ and the constant 0. In the paper, by a T<sup>I</sup>-term, T<sup>I</sup>-atom, T<sup>I</sup>-formula, etc. we mean a term, atom, formula in the signature of T<sup>I</sup>. Below, we use the abbreviation i < j for i ≤ j ∧ i ≠ j. The constant 0 is meant to separate 'formally positive' indexes - those satisfying 0 ≤ i - from the remaining 'formally negative' ones.

Examples of index theories are TO itself, integer difference logic IDL, integer linear arithmetic LIA, and real linear arithmetic LRA. In order to match the requirements of Definition 4, one must however make a careful choice of the language, see [9] for details: the most important detail is that integer (resp. real) division by all positive integers should be added to the language of LIA (resp. LRA). For most applications, IDL (namely the theory of the integer numbers with 0, ordering, successor and predecessor)<sup>8</sup> suffices, as in this theory one can model counters for scanning arrays.

<sup>8</sup> The name 'integer difference logic' comes from the fact that atoms in this theory are equivalent to formulæ of the kind S<sup>n</sup>(i) ✶ j (where ✶ ∈ {≤, ≥, =}), thus they represent difference bound constraints of the kind j − i ✶ n for n ≥ 0.

Given an index theory T<sup>I</sup>, we now introduce our *array theory with maxdiff* ARD(T<sup>I</sup>) (parameterized by T<sup>I</sup>) as follows. We still have three sorts ARRAY, ELEM, INDEX; the language includes the symbols of T<sup>I</sup>, the read and write operations rd, wr, a binary function diff of type ARRAY × ARRAY → INDEX, as well as constants ε and ⊥ of sorts ARRAY and ELEM, respectively. The constant ⊥ models an undetermined (e.g. undefined, not-in-use, not coming from appropriate initialization, etc.) value and ε models the totally undefined array; the term diff(x, y) returns the maximum index where x and y differ and returns 0 if x and y are equal.<sup>9</sup> Formally, the axioms of ARD(T<sup>I</sup>) include, besides the axioms of T<sup>I</sup>, the following ones:

$$\forall y, i, e. \ i \ge 0 \to rd(wr(y, i, e), i) = e \tag{3}$$

$$\forall y, i, j, e. \ i \neq j \rightarrow rd(wr(y, i, e), j) = rd(y, j) \tag{4}$$

$$\forall x, y. \ x \neq y \to r d(x, \mathtt{diff}(x, y)) \neq r d(y, \mathtt{diff}(x, y)) \tag{5}$$

$$\forall x, y, i. \ i > \mathbf{diff}(x, y) \to rd(x, i) = rd(y, i) \tag{6}$$

$$\forall x. \; \mathbf{diff}(x, x) = 0 \tag{7}$$

$$\forall x, i. \ i < 0 \to rd(x, i) = \bot \tag{8}$$

$$
\forall i. \quad rd(\varepsilon, i) = \bot \tag{9}
$$

In the read-over-write axiom (3), we put the proviso i ≥ 0 because we want all our arrays to be undefined on negative indexes (negative updates make no sense and have no effect: by axiom (8), reading a negative index always produces ⊥).

We call ARext(T<sup>I</sup>) (the 'theory of arrays with extensionality parameterized by T<sup>I</sup>') the theory obtained from ARD(T<sup>I</sup>) by removing the symbol diff and by replacing the axioms (5)-(7) with the extensionality axiom (2). Since the extensionality axiom follows from axiom (5), ARD(T<sup>I</sup>) is an extension of ARext(T<sup>I</sup>).

As an effect of the above axioms, we have that an array x is undefined outside the interval [0, |x|], where |x| is defined as |x| := diff(x, ε). Typically, this interval is finite and in fact our proof of Theorem 3 below shows that any satisfiable constraint is satisfiable in a model where all such intervals (relatively to the variables involved in the constraint) are finite.
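
As a small worked instance of the axioms above (an illustration of ours, assuming e ≠ ⊥), consider the array a := wr(ε, 2, e): axioms (3), (4), (8) and (9) give rd(a, 2) = e, rd(a, 5) = rd(ε, 5) = ⊥ and rd(a, −1) = ⊥; since a and ε then differ exactly at index 2, we get diff(a, ε) = 2, i.e. |a| = 2.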

The next lemma is immediate from the axiomatization of ARD(T<sup>I</sup> ):

**Lemma 1.** *An atom of the form* a = b *is equivalent (modulo* ARD(T<sup>I</sup> )*) to*

$$\mathbf{diff}(a,b) = 0 \land rd(a,0) = rd(b,0) \; . \tag{10}$$

*An atom of the form* a = wr(b, i, e) *is equivalent (modulo* ARD*) to*

$$(i \ge 0 \to rd(a, i) = e) \land \forall h \ (h \ne i \to rd(a, h) = rd(b, h)) \;. \tag{11}$$

*An atom of the form* diff(a, b) = i *is equivalent (modulo* ARD(T<sup>I</sup> )*) to*

$$i \ge 0 \ \land\ \forall h \, ( h > i \to rd(a, h) = rd(b, h) ) \ \land\ (i > 0 \to rd(a, i) \ne rd(b, i)) \ . \tag{12}$$

<sup>9</sup> Notice that it might well be the case that diff(x, y) = 0 for different x, y, but in that case 0 is the only index where x, y differ.

For our interpolation algorithm in Section 7, we need to introduce iterated diff operations, similarly to [37]. As we know, diff(a, b) returns the biggest index where a and b differ (it returns 0 if a = b). Now we want an operator that returns the last-but-one index where a, b differ (0 if a, b differ in at most one index), an operator that returns the last-but-two index where a, b differ (0 if they differ in at most two indexes), etc. Our language is already expressive enough for that, so we can introduce such operators explicitly as follows. Given array variables a, b, we define by mutual recursion the sequence of array terms b1, b2, ... and of index terms diff1(a, b), diff2(a, b), ... :

$$\begin{aligned} b\_1 &:= b; & \mathbf{diff}\_1(a, b) &:= \mathbf{diff}(a, b\_1); \\ b\_{k+1} &:= wr(b\_k, \mathbf{diff}\_k(a, b), rd(a, \mathbf{diff}\_k(a, b))); & \mathbf{diff}\_{k+1}(a, b) &:= \mathbf{diff}(a, b\_{k+1}) \end{aligned}$$

Intuitively, b<sub>k+1</sub> is the same as b except on the k last indexes on which a and b differ, where b<sub>k+1</sub> takes the same value as a. A useful fact is that conjunctions of formulae of the kind ⋀<sub>j≤l</sub> diff<sub>j</sub>(a, b) = k<sub>j</sub> can be eliminated in favor of universal clauses in a language whose only symbol for array variables is rd. In detail:
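
The following sketch (our own illustration, not part of the paper) models arrays as Python dictionaries over non-negative indexes, reading absent keys as ⊥, and computes diff and the iterated diff<sub>k</sub> directly from the recursive definition above.

```python
# Illustration only: arrays as dicts over non-negative indexes; missing keys read as ⊥.
BOT = None  # stands for the undefined element value ⊥

def rd(a, i):
    """Read a[i]; negative indexes and absent keys yield ⊥."""
    return a.get(i, BOT) if i >= 0 else BOT

def wr(a, i, e):
    """Functional write: a copy of a with index i (if non-negative) set to e."""
    b = dict(a)
    if i >= 0:
        b[i] = e
    return b

def diff(a, b):
    """Maximum index where a and b differ, and 0 if they agree everywhere."""
    indexes = [i for i in set(a) | set(b) if rd(a, i) != rd(b, i)]
    return max(indexes, default=0)

def diff_k(a, b, k):
    """Iterated diff: b_{j+1} overwrites b_j at diff_j(a, b) with a's value there."""
    bj = b
    for _ in range(k - 1):
        d = diff(a, bj)
        bj = wr(bj, d, rd(a, d))
    return diff(a, bj)

# e.g. a and b differing exactly at indexes 1, 3, 7:
a = {1: "x", 3: "y", 7: "z"}
b = {}
assert [diff_k(a, b, k) for k in (1, 2, 3, 4)] == [7, 3, 1, 0]
```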

**Lemma 2.** *A formula like*

$$\mathtt{diff}\_1(a,b) = k\_1 \land \cdots \land \mathtt{diff}\_l(a,b) = k\_l \tag{13}$$

*is equivalent modulo* ARD *to the conjunction of the following five formulae:*

$$k\_1 \ge k\_2 \land \dots \land k\_{l-1} \ge k\_l \land k\_l \ge 0 \tag{14}$$

$$\bigwedge_{j < l} \, ( k_j > k_{j+1} \to rd(a, k_j) \neq rd(b, k_j) ) \tag{15}$$

$$\bigwedge_{j < l} \, ( k_j = k_{j+1} \to k_j = 0 ) \tag{16}$$

$$\bigwedge\_{j \le l} (rd(a, k\_j) = rd(b, k\_j) \to k\_j = 0) \tag{17}$$

$$\forall h \; (h > k\_l \to rd(a, h) = rd(b, h) \lor h = k\_1 \lor \dots \lor h = k\_{l-1}) \tag{18}$$

#### **3.1 Our roadmap**

The main result of the paper is that, for every index theory T<sup>I</sup>, the array theory with maxdiff ARD(T<sup>I</sup>) indexed by T<sup>I</sup> *enjoys quantifier-free interpolation* and that *interpolants can be computed hierarchically* by relying on a black-box quantifier-free interpolation algorithm for the weaker theory T<sup>I</sup> ∪ EUF (the latter theory has quantifier-free interpolation because T<sup>I</sup> is strongly amalgamable and because of Theorem 1). In this subsection, we supply intuitions and give a qualitative high-level view of our proofs: more technical details and full proofs can be found in [21].

#### **The algorithm.**

By general easy transformations (recalled in Section 7 below), it is sufficient to be able to extract a quantifier-free interpolant out of a pair of quantifier-free formulae A, B such that (i) A∧B is ARD(T<sup>I</sup> )-inconsistent; (ii) both A and B are conjunctions of flat literals, i.e. of literals which are equalities between variables, disequalities between variables or literals of the form R(x), ¬R(x), f(x) = y (where x, y are variables, R is a predicate symbol and f a function symbol).

Let us call *common* the variables occurring in both A and B. The fact that a quantifier-free interpolant exists intuitively means that there are two reasoners (an A-reasoner operating on formulae involving only the variables occurring in A and a B-reasoner operating on formulae involving only the variables occurring in B) that are able to discover the inconsistency of A ∧ B by exchanging information on the common language, i.e. by communicating to each other only the entailed quantifier-free formulae involving the common variables.

A problem that must be addressed when designing an interpolation algorithm is that there are infinitely many common terms that can be built up out of finitely many common variables, and it may happen that some uncommon terms are recognized to be equal to some common terms during the deductions performed by the A-reasoner and the B-reasoner.

As an example, suppose that A contains the literals c1 = wr(c2, i, e), c1 ≠ c2, a = wr(c3, i, e), where only c1, c2, c3 are common (i.e. only these variables also occur in B). Then, using diff operations, we can deduce i = diff(c1, c2) and e = rd(c1, i), so that in the end we can conclude that a is also 'common', being definable in terms of common variables. Thus, the A-reasoner must communicate (via a defining common term or in some other indirect way) to the B-reasoner any fact it discovers about a, even though a was not listed among the common variables from the very beginning. In more sophisticated examples, iterated diff operations are needed to discover 'hidden' common facts.

To cope with the above problem, our algorithm *gives names* i<sup>k</sup> = diffk(c1, c2) to all the iterated diffs of common array variables c1, c<sup>2</sup> (the newly introduced names i<sup>k</sup> are considered common and can be replaced back with their defining terms when the interpolants are computed at the end of the algorithm).

The second component of our algorithm is *instantiation*. Both the A- and the B-reasoner use the content of Lemmas 1 and 2 in order to handle atoms of the kind a = b, a<sup>1</sup> = wr(a2, i, e), i = diffk(a1, a2). Whenever they come across such atoms, the equivalent formulæ supplied by these lemmas are taken into consideration; in fact, whenever the lemmas produce universally quantified clauses of the kind ∀h C, they replace in C the universally quantified index variable h by *all possible instantiations* with their own index terms (these are the terms built up from index variables occurring in A for the A-reasoner and occurring in B for the B-reasoner respectively). Such instantiations can be read as *clauses in the language of* T<sup>I</sup> ∪EUF if we replace every array variable a by a fresh unary function symbol f<sup>a</sup> and read terms like rd(a, i) as fa(i).
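
As an illustration of this instantiation-and-renaming step (a sketch of ours that manipulates plain strings rather than solver terms, with hypothetical helper and variable names), the following generates the T<sup>I</sup> ∪ EUF instances of the equivalence (11) for an atom a = wr(b, i, e), instantiating the universally quantified index h with a given set of owned index terms and rendering rd(x, t) as f_x(t):

```python
def instantiate_wr_atom(a, b, i, e, index_terms):
    """Sketch only: instances of a = wr(b, i, e) in the language of T_I ∪ EUF,
    following (11): (i >= 0 -> rd(a, i) = e) plus, for each owned index term t,
    (t != i -> rd(a, t) = rd(b, t)), where rd(x, t) is written as f_x(t)."""
    clauses = [f"{i} >= 0 -> f_{a}({i}) = {e}"]
    for t in index_terms:
        clauses.append(f"{t} != {i} -> f_{a}({t}) = f_{b}({t})")
    return clauses

# e.g. the A-reasoner instantiating with its own index terms i, j, k:
for clause in instantiate_wr_atom("a", "b", "i", "e", ["i", "j", "k"]):
    print(clause)
```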

Of course, both the production of names for iterated diff-terms and the instantiation with owned index terms need to be repeated (possibly, infinitely many times); we prove however (this is the content of our main Theorem 4 below) that *if* A ∧ B *is* ARD(T<sup>I</sup>)*-inconsistent, then sooner or later the union of the sets of the clauses deduced by the* A*-reasoner and the* B*-reasoner in the restricted signature of* T<sup>I</sup> ∪ EUF *is* T<sup>I</sup> ∪ EUF*-inconsistent*, i.e., the instantiation process terminates. This means that an interpolant can be extracted, using a black-box quantifier-free interpolation algorithm for the weaker theory T<sup>I</sup> ∪ EUF. In the simple case where T<sup>I</sup> is just the theory TO of total orders, we shall prove in Section 8 that a *quadratic* number of instantiations always suffices. In the general case, however, the situation is similar to the statement of Herbrand's theorem: finitely many instantiations suffice to get an inconsistency proof in the weaker logical formalism, but a bound cannot be given in advance.

#### **The proof.**

Theorem 4 is proved in a contrapositive way: we show that *if a* T<sup>I</sup> ∪ EUF*-inconsistency never arises, then* A ∧ B *is* ARD(T<sup>I</sup> )*-consistent*. This is proved in two steps: if a T<sup>I</sup> ∪ EUF-inconsistency does not arise, we produce two ARD(T<sup>I</sup> ) models 𝒜 and ℬ, where 𝒜 satisfies A and ℬ satisfies B. Moreover, 𝒜 and ℬ are built up in such a way that they share the same ARD(T<sup>I</sup> )-substructure. In the second step, we prove the amalgamation theorem for ARD(T<sup>I</sup> ), so that the amalgamated model yields the desired model of A ∧ B. In fact, the two steps are presented in reverse order in our exposition: we first prove the amalgamation theorem in Section 5 (Theorem 2) and then our main theorem in Section 7 (Theorem 4).

### **4 Embeddings**

We preliminarily discuss the class of models of ARD(T<sup>I</sup> ) and we make important clarifications about embeddings between such models. A model M of ARext(T<sup>I</sup> ) or of ARD(T<sup>I</sup> ) is *functional* when the following conditions are satisfied:


Because of the extensionality axiom, it can be shown that every model is *isomorphic to a functional one*. For an array a ∈ ARRAY<sup>M</sup> in a functional model M and for i ∈ INDEX<sup>M</sup>, since a is a function, we interchangeably use the notations a(i) and rd(a, i). A functional model M is said to be *full* iff ARRAY<sup>M</sup> consists of *all* the positive-support functions from INDEX<sup>M</sup> to ELEM<sup>M</sup>.

Let a, b be elements of ARRAY<sup>M</sup> in a model M. We say that a *and* b *are cardinality dependent* (in symbols, M |= |a − b| < ω) iff {i ∈ INDEX<sup>M</sup> | M |= rd(a, i) ≠ rd(b, i)} is finite. Cardinality dependency in M is obviously an equivalence relation, which we sometimes denote by ∼M.

Passing to ARD(T<sup>I</sup> ), a further remark is in order: in a functional model M of ARD(T<sup>I</sup> ), the index diff(a, b) (if it exists) is uniquely determined: it must be the maximum index where a, b differ (it is 0 if a = b). We say that diff(a, b) is *defined* iff there is a maximum index where a, b differ (or if a = b). An embedding μ : M −→ N between ARext(T<sup>I</sup> )-models is said to be diff-faithful iff whenever diff(a, b) is defined, so is diff(μ(a), μ(b)), and the latter is equal to μ(diff(a, b)). Since there might not be a maximum index where a, b differ, in principle it is not always possible to expand a functional model of ARext(T<sup>I</sup> ) to a functional model of ARD(T<sup>I</sup> ) while keeping the set of indexes unchanged. Indeed, in order to do that in a diff-faithful way, one needs to explicitly add to INDEX<sup>M</sup> new indexes, including at least indexes representing the missing maximum indexes where two given arrays differ. This idea is used in the following lemma (proved in the online available extended version [21]):

**Lemma 3.** *For every index theory* T<sup>I</sup> *, every model of* ARext(T<sup>I</sup> ) *has a* diff*-faithful embedding into a model of* ARD(T<sup>I</sup> )*.*

### **5 Amalgamation**

We now sketch the proof of the amalgamation property for ARD(T<sup>I</sup> ). We recall that strong amalgamation holds for models of T<sup>I</sup> (see Definition 4).

**Theorem 2.** ARD(T<sup>I</sup> ) *enjoys the amalgamation property.*

*Proof.* Take two embeddings μ1 : N −→ M1 and μ2 : N −→ M2. As we know, we can suppose w.l.o.g. that N, M1, M2 are functional models; in addition, via suitable renamings, we can freely suppose that μ1, μ2 restrict to inclusions for the sorts INDEX and ELEM, and that (ELEM<sup>M1</sup> \ ELEM<sup>N</sup>) ∩ (ELEM<sup>M2</sup> \ ELEM<sup>N</sup>) = ∅ and (INDEX<sup>M1</sup> \ INDEX<sup>N</sup>) ∩ (INDEX<sup>M2</sup> \ INDEX<sup>N</sup>) = ∅. To build the amalgamated model of ARD(T<sup>I</sup> ), we first build a full model M of ARext(T<sup>I</sup> ) with diff-faithful embeddings ν1 : M1 −→ M and ν2 : M2 −→ M such that ν1 ◦ μ1 = ν2 ◦ μ2. If we succeed, the claim follows by Lemma 3: indeed, thanks to that lemma, we can embed M (which is a model of ARext(T<sup>I</sup> )) in a diff-faithful way into a model M′ of ARD(T<sup>I</sup> ), which is the required ARD(T<sup>I</sup> )-amalgam.

We take the T<sup>I</sup> -reduct of M to be a model supplied by the strong amalgamation property of T<sup>I</sup> (again, we can freely assume that the T<sup>I</sup> -reducts of M1, M2 identically include into it); we let ELEM<sup>M</sup> be ELEM<sup>M1</sup> ∪ ELEM<sup>M2</sup>. We need to define νi : Mi −→ M (i = 1, 2) in such a way that νi is diff-faithful and ν1 ◦ μ1 = ν2 ◦ μ2. We take the INDEX and the ELEM-components of ν1, ν2 to be just identical inclusions. The only relevant point is the action of νi on ARRAY<sup>Mi</sup>: since we have strong amalgamation for indexes, in order to define it, it is sufficient to extend any a ∈ ARRAY<sup>Mi</sup> to all the indexes k ∈ (INDEX<sup>M</sup> \ INDEX<sup>Mi</sup>). For indexes k ∈ (INDEX<sup>M</sup> \ (INDEX<sup>M1</sup> ∪ INDEX<sup>M2</sup>)) we can just put νi(a)(k) = ⊥. If k ∈ (INDEX<sup>M</sup> \ INDEX<sup>Mi</sup>) and k ∈ (INDEX<sup>M1</sup> ∪ INDEX<sup>M2</sup>), then k ∈ (INDEX<sup>M3−i</sup> \ INDEX<sup>N</sup>); the definition for such k is as follows:

(\*) we let νi(a)(k) be equal to μ3−i(c)(k), where c is any array c ∈ ARRAY<sup>N</sup> for which there is a′ ∈ ARRAY<sup>Mi</sup> such that a ∼Mi a′ and such that the relation k > diff<sup>Mi</sup>(a′, μi(c)) holds in INDEX<sup>M</sup>;<sup>10</sup> if such c does not exist, then we put νi(a)(k) = ⊥.

<sup>10</sup> This should be properly written as k > νi(diff<sup>Mi</sup>(a′, μi(c))); however, recall that the INDEX-component of νi is the identity, so the simplified notation is nevertheless correct.

Definition (\*) is forced by some constraints that νi(a)(k) must satisfy. Of course, definition (\*) itself needs to be justified: besides showing that it enjoys the required properties, we must also prove that it is well-given (i.e. that it does not depend on the selected c and a′). It is easy to see that, if the definition is correct, then we have ν1 ◦ μ1 = ν2 ◦ μ2; also, it is clear that νi preserves read and write operations (hence, it is a homomorphism) and is injective. For (i) justifying the definition of νi and (ii) showing that it is also diff-faithful, we need to show the following two claims (the proof is not easy, see the extended version [21] for details) for arrays a1, a2 ∈ ARRAY<sup>M1</sup>, for an index k ∈ (INDEX<sup>M2</sup> \ INDEX<sup>N</sup>) and for arrays c1, c2 ∈ ARRAY<sup>N</sup> (checking the same facts in M2 is symmetrical):


### **6 Satisfiability**

The key step of the interpolation algorithm that will be proposed in Section 7 depends upon the problem of checking satisfiability (modulo ARD(T<sup>I</sup> )) of quantifier-free formulæ; this will be solved in the present section by adapting instantiation techniques, like those from [7].

We define the *complexity* c(t) of a term t as the number of function symbols occurring in t (thus variables and constants have complexity 0). A *flat* literal L is a formula of the kind x1 = t or x1 ≠ x2 or R(x1,...,xn) or ¬R(x1,...,xn), where the xi are variables, R is a relation symbol, and t is a term of complexity less than or equal to 1. If I is a set of T<sup>I</sup> -terms, an I*-instance* of a universal formula of the kind ∀i φ is a formula of the kind φ(t/i) for some t ∈ I.
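For instance (a purely illustrative case, using one of the universal clauses recalled later in this section): for the clause ∀i (i < 0 → rd(a, i) = ⊥) and I = {i1, i3}, the I-instances are

$$i\_1 < 0 \to rd(a, i\_1) = \bot \qquad\text{and}\qquad i\_3 < 0 \to rd(a, i\_3) = \bot.$$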

A pair of sets of quantifier-free formulae Φ = (Φ1, Φ2) is a *separated pair* iff (1) Φ<sup>1</sup> contains equalities of the form diffk(a, b) = i and a = wr(b, i, e); moreover if it contains the equality diffk(a, b) = i, it must also contain an equality of the form diffl(a, b) = j for every l<k;

(2) Φ<sup>2</sup> contains Boolean combinations of T<sup>I</sup> -atoms and of atoms of the forms:

$$rd(a, i) = rd(b, j), \quad rd(a, i) = e, \quad e\_1 = e\_2,\tag{19}$$

where a, b, i, j, e, e1, e<sup>2</sup> are variables or constants of the appropriate sorts. The separated pair is said to be finite iff Φ<sup>1</sup> and Φ<sup>2</sup> are both finite.

In practice, in a separated pair Φ = (Φ1, Φ2), reading rd(a, i) as a functional application, it turns out that *the formulæ from* Φ<sup>2</sup> *can be translated into quantifier-free formulæ of the combined theory* T<sup>I</sup> ∪ EUF (the array variables occurring in Φ<sup>2</sup> are converted into free unary function symbols). T<sup>I</sup> ∪EUF enjoys the decidability of the quantifier-free fragment and has quantifier-free interpolation because T<sup>I</sup> is an index theory (see Nelson-Oppen results [33] and Theorem 1): we adopt a hierarchical approach (similar to [35,36]) and *we rely on satisfiability and interpolation algorithms for such a theory as black boxes*.

Let I be a set of T<sup>I</sup> -terms and let Φ = (Φ1, Φ2) be a separated pair; we let Φ(I)=(Φ1(I), Φ2(I)) be the smallest separated pair satisfying the following conditions:


$$\forall i \; rd(\varepsilon, i) = \bot, \; \forall i \; (i < 0 \to rd(a, i) = \bot),$$

where a is any array variable occurring in Φ<sup>1</sup> or Φ2;


For M ∈ ℕ ∪ {∞}, the M*-instantiation* of Φ = (Φ1, Φ2) is the separated pair Φ(I<sup>M</sup><sub>Φ</sub>) = (Φ1(I<sup>M</sup><sub>Φ</sub>), Φ2(I<sup>M</sup><sub>Φ</sub>)), where I<sup>M</sup><sub>Φ</sub> is the set of T<sup>I</sup> -terms of complexity at most M built up from the index variables occurring in Φ1, Φ2. The *full instantiation* of Φ = (Φ1, Φ2) is the separated pair Φ(I<sup>∞</sup><sub>Φ</sub>) = (Φ1(I<sup>∞</sup><sub>Φ</sub>), Φ2(I<sup>∞</sup><sub>Φ</sub>)) (which is usually not finite). A separated pair Φ = (Φ1, Φ2) is M*-instantiated* iff Φ = Φ(I<sup>M</sup><sub>Φ</sub>); it is ARD(T<sup>I</sup> )-satisfiable iff so is the formula Φ1 ∧ Φ2.<sup>11</sup>

*Example 1. Let* Φ<sup>1</sup> *contain the four atoms*

$$\{\ \mathrm{diff}\_1(a, c\_1) = i\_1,\ \ \mathrm{diff}\_1(b, c\_2) = i\_1,\ \ a = wr(a\_1, i\_3, e\_3),\ \ a\_1 = wr(b, i\_1, e\_1)\ \}$$

*and let* Φ<sup>2</sup> *be empty. Then* (Φ1, Φ2) *is a separated pair; 0-instantiating it adds to* Φ<sup>2</sup> *the following formulae (we delete those which are redundant)*

$$\begin{aligned} i\_1 &\ge 0\\ rd(a, i\_1) &= rd(c\_1, i\_1) \to i\_1 = 0\\ i\_3 &> i\_1 \to rd(a, i\_3) = rd(c\_1, i\_3) \\ i\_3 &\ge 0 \to rd(a, i\_3) = e\_3 & i\_1 \ge 0 \to rd(a\_1, i\_1) = e\_1\\ i\_1 &\ne i\_3 \to rd(a, i\_1) = rd(a\_1, i\_1) \end{aligned} \qquad \begin{aligned} rd(b, i\_1) &= rd(c\_2, i\_1) \to i\_1 = 0\\ i\_3 &> i\_1 \to rd(b, i\_3) = rd(c\_2, i\_3) \\ i\_1 &\ge 0 \to rd(a\_1, i\_1) = e\_1\\ i\_1 &\ne i\_3 \to rd(a\_1, i\_3) = rd(b, i\_3) \end{aligned}$$
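To make the hierarchical reduction concrete, the following sketch (ours, not part of the paper's implementation [1]) encodes a few of the instances displayed above in the language of T<sup>I</sup> ∪ EUF: every array variable a is replaced by a free unary function symbol f_a, and the satisfiability check is delegated to an off-the-shelf SMT solver, with LIA playing the role of the index theory. All names and sort choices are illustrative assumptions.

```
# Sketch: some 0-instances of Example 1, read over T_I ∪ EUF by turning rd(a, i)
# into f_a(i) for a free function symbol f_a (names f_a, f_a1, ... are ours).
from z3 import Ints, Function, IntSort, Solver, Implies, sat

i1, i3, e1, e3 = Ints('i1 i3 e1 e3')
f_a  = Function('f_a',  IntSort(), IntSort())   # rd(a, .)
f_a1 = Function('f_a1', IntSort(), IntSort())   # rd(a1, .)
f_b  = Function('f_b',  IntSort(), IntSort())   # rd(b, .)
f_c1 = Function('f_c1', IntSort(), IntSort())   # rd(c1, .)
f_c2 = Function('f_c2', IntSort(), IntSort())   # rd(c2, .)

s = Solver()
s.add(i1 >= 0)
s.add(Implies(f_a(i1) == f_c1(i1), i1 == 0))    # from diff1(a, c1) = i1
s.add(Implies(i3 > i1, f_a(i3) == f_c1(i3)))
s.add(Implies(f_b(i1) == f_c2(i1), i1 == 0))    # from diff1(b, c2) = i1
s.add(Implies(i3 > i1, f_b(i3) == f_c2(i3)))
s.add(Implies(i3 >= 0, f_a(i3) == e3))          # from a = wr(a1, i3, e3)
s.add(Implies(i1 != i3, f_a(i1) == f_a1(i1)))
s.add(Implies(i1 >= 0, f_a1(i1) == e1))         # from a1 = wr(b, i1, e1)
s.add(Implies(i1 != i3, f_a1(i3) == f_b(i3)))

print(s.check() == sat)   # the instances of A alone are satisfiable;
                          # inconsistency only arises once B is added (Example 2)
```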

The following results are proved in the extended version [21]:

**Lemma 4.** *Let* φ *be a quantifier-free formula; then it is possible to compute finitely many finite separated pairs* Φ<sup>1</sup> = (Φ<sup>1</sup><sub>1</sub>, Φ<sup>1</sup><sub>2</sub>), ..., Φ<sup>n</sup> = (Φ<sup>n</sup><sub>1</sub>, Φ<sup>n</sup><sub>2</sub>) *such that* φ *is* ARD(T<sup>I</sup> )*-satisfiable iff so is one of the* Φ<sup>i</sup>*.*

**Lemma 5.** *The following conditions are equivalent for a finite separated pair* Φ = (Φ1, Φ2)*:*


**Theorem 3.** *The* SMT(ARD(T<sup>I</sup> )) *problem is decidable for every index theory* T<sup>I</sup> *(i.e. for every theory satisfying Definition 4).*

<sup>11</sup> This might be an infinitary formula if Φ is not finite. In such a case, satisfiability obviously means that there is a model M where we can assign values to all variables occurring in the formulæ from Φ<sup>1</sup> ∪ Φ<sup>2</sup> in such a way that such formulæ become simultaneously true.

Concerning the complexity of the above procedure, notice that satisfiability of the quantifier-free fragment of common index theories (like IDL, LIA, LRA) is decidable in NP; as a consequence, from the above proof we also get (for such index theories) an NP bound for our SMT(ARD(T<sup>I</sup> )) problems, because 0-instantiation is clearly finite and polynomial. The fact that 0-instantiation suffices is a feature shared by the above satisfiability procedure and the satisfiability procedures from [7]. Unfortunately, when we come to interpolation algorithms in the next section, there is no evidence that 0-instantiation suffices.

### **7 An interpolation algorithm**

Since amalgamation is equivalent to quantifier-free interpolation for universal theories like ARD(T<sup>I</sup> ) (see Theorem 1), Theorem 2 ensures that ARD(T<sup>I</sup> ) has the quantifier-free interpolation property. However, the proof of Theorem 2 is not constructive, so in order to compute an interpolant for an ARD(T<sup>I</sup> )-unsatisfiable conjunction like ψ(x, y) ∧ φ(y, z), one would have to enumerate all quantifier-free formulæ θ(y) which are logical consequences of φ and are inconsistent with ψ (modulo ARD(T<sup>I</sup> )). Since the quantifier-free fragment of ARD(T<sup>I</sup> ) is decidable by Theorem 3, this is an effective procedure and, since interpolants of jointly unsatisfiable pairs of formulæ exist, it also terminates. However, this kind of algorithm is not practical.

In this section, we improve the situation by supplying a better algorithm based on instantiation (à la Herbrand). In the next section, using the results of the present section, we identify a complexity bound for this algorithm in the special case where T<sup>I</sup> is just the theory of linear orders.

Our problem is the following: given two quantifier-free formulae A and B such that A ∧ B is not satisfiable (modulo ARD(T<sup>I</sup> )), compute a quantifier-free formula C such that ARD(T<sup>I</sup> ) |= A → C, ARD(T<sup>I</sup> ) |= C ∧ B → ⊥, and such that C contains only the variables (of sort INDEX, ARRAY, ELEM) which occur both in A and in B.

We call the variables occurring in both A and B *common variables*, whereas the variables occurring in A (resp. in B) are called A*-variables* (resp. B*-variables*). The same terminology applies to terms, atoms and formulae: e.g., a term t is an A-term (B-term, common term) iff it is built up from A-variables (B-variables, common variables, resp.).

The following operations can be freely performed (see [9] or [8] for details):


(v) conjoin B with some quantifier-free B-formula which is implied (modulo ARD(T<sup>I</sup> )) by B.

Operations (i)-(v) either add logical consequences or add explicit definitions that can be eliminated (if desired) after the final computation of the interpolant. In addition, notice that if A is of the form A′ ∨ A″ (resp. B is of the form B′ ∨ B″), then from interpolants of A′ ∧ B and A″ ∧ B (resp. of A ∧ B′ and A ∧ B″) we can recover an interpolant of A ∧ B by taking their disjunction (resp. conjunction).

Because of the above remarks, using the procedure in the proof of Lemma 4, both A and B are assumed to be given in the form of finite separated pairs. Thus A is of the form A1 ∧ A2 and B is of the form B1 ∧ B2, for separated pairs (A1, A2) and (B1, B2). Also, by (iv)-(v) above, A and B are assumed to be both 0-instantiated. We call A (resp. B) the separated pair (A1, A2) (resp. (B1, B2)). We also use the letters A1, A2, B1, B2 both for sets of formulae and for the corresponding conjunctions; similarly, A represents both the pair (A1, A2) and the conjunction A1 ∧ A2 (and similarly for B).

The formulæ from A2 and B2 are formulæ in the signature of T<sup>I</sup> ∪ EUF (after rewriting terms of the kind rd(a, i) to fa(i), where the fa are free function symbols). Of course, if A2 ∧ B2 is T<sup>I</sup> ∪ EUF-inconsistent, *we can get our quantifier-free interpolant by using our black-box algorithm for interpolation in the weaker theory* T<sup>I</sup> ∪ EUF: recall that T<sup>I</sup> ∪ EUF has quantifier-free interpolation because T<sup>I</sup> is an index theory and by Theorem 1. The remarkable fact is that A2 ∧ B2 always becomes T<sup>I</sup> ∪ EUF-inconsistent if *sufficiently many* diff*s among common array variables are introduced* and *sufficiently many instantiations are performed*.

Formally, we shall *apply the loop below until* A2∧B<sup>2</sup> *becomes inconsistent*: the loop is justified by (i)-(v) above and Theorem 4 guarantees that A<sup>2</sup> ∧B<sup>2</sup> eventually becomes inconsistent modulo T<sup>I</sup> ∪EUF, if A ∧ B was originally inconsistent modulo ARD(T<sup>I</sup> ). When A2∧B<sup>2</sup> becomes inconsistent modulo T<sup>I</sup>∪EUF, we can get our interpolant using the interpolation algorithm for T<sup>I</sup> ∪ EUF. [Of course, in the interpolant returned by T<sup>I</sup> ∪ EUF, the extra variables introduced by the explicit definitions from (iii) above need to be eliminated.] We need a counter M recording how many times the Loop below has been executed (initially M = 0).

**Loop** *(to be repeated until* A2 ∧ B2 *becomes inconsistent modulo* T<sup>I</sup> ∪ EUF*). Pick two distinct common* ARRAY*-variables* c1, c2 *and an* n ≥ 1 *such that no conjunct of the kind* diffn(c1, c2) = k *occurs in both* A1 *and* B1*, but such that for every* l < n *there is a conjunct of the form* diffl(c1, c2) = k *occurring in both* A1 *and* B1*. Pick also a fresh* INDEX *constant* kn*; conjoin* diffn(c1, c2) = kn *to both* A1 *and* B1*; then* M*-instantiate both* A *and* B*. Increase* M *to* M + 1*.*

Notice that the fresh index constants k<sup>n</sup> introduced during the loop are considered common constants (they come from explicit definitions like (iii) above) and so they are considered in the M-instantiation of both A and B.
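The overall procedure can be summarised by the following high-level sketch (ours, not the paper's implementation [1]). The two black boxes for T<sup>I</sup> ∪ EUF — an unsatisfiability check and a quantifier-free interpolator — are passed in as functions, and the helpers that manipulate the separated pairs A = (A1, A2), B = (B1, B2) are left abstract; all names below are illustrative placeholders.

```
from itertools import count

def interpolation_loop(A, B, common_array_pairs, euf_unsat, euf_interpolant,
                       add_diff_atom, instantiate, eliminate_definitions):
    depth = {}                                   # diff-depth reached so far per pair
    for M in count(0):                           # M counts executions of the loop
        if euf_unsat(A.phi2 + B.phi2):           # black-box check in T_I ∪ EUF
            C = euf_interpolant(A.phi2, B.phi2)  # black-box quantifier-free interpolant
            return eliminate_definitions(C)      # replace each k_n by diff_n(c1, c2)
        # fair choice: pick the least-explored pair of common array variables
        c1, c2 = min(common_array_pairs, key=lambda p: depth.get(p, 0))
        n = depth.get((c1, c2), 0) + 1
        depth[(c1, c2)] = n
        k_n = ('k', c1, c2, n)                   # fresh common INDEX constant
        add_diff_atom(A, c1, c2, n, k_n)         # conjoin diff_n(c1, c2) = k_n to A1
        add_diff_atom(B, c1, c2, n, k_n)         # ... and to B1
        instantiate(A, M)                        # M-instantiate both separated pairs
        instantiate(B, M)
```

Theorem 4 below guarantees that, whenever A ∧ B is ARD(T<sup>I</sup> )-inconsistent, this loop terminates.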

*Example 2. Let* A *be the formula* Φ<sup>1</sup> *from Example 1 and let* B *be*

i1 < i2 ∧ i2 < i3 ∧ rd(c1, i2) ≠ rd(c2, i2)

B *is 0-instantiated; 0-instantiating* A *produces the formulæ shown in Example 1. The loop needs to be executed twice; it adds the literals* diff0(c1, c2) = k0, diff1(c1, c2) = k1*; 0-instantiation produces formulae* A2*,* B<sup>2</sup> *whose conjunction is* T<sup>I</sup> ∪EUF*-inconsistent (inconsistency can be tested via an SMT-solver like* z3 *or* MathSat*, see the ongoing implementation [1]). The related* T<sup>I</sup> ∪ EUF*interpolant (once* k<sup>0</sup> *and* k<sup>1</sup> *are replaced by* diff0(c1, c2) *and* diff1(c1, c2)*, respectively) gives our* ARD(T<sup>I</sup> )*-interpolant.* -

### **Theorem 4.** *If* A∧B *is* ARD(T<sup>I</sup> )*-inconsistent, then the above loop terminates.*

*Proof.* Suppose that the loop does not terminate, and let A′ = (A′1, A′2) and B′ = (B′1, B′2) be the separated pairs obtained after infinitely many executions of the loop (they are the unions of the pairs obtained at each step). Notice that both A′ and B′ are fully instantiated.<sup>12</sup> We claim that (A′, B′) is ARD(T<sup>I</sup> )-consistent (contradicting the assumption that (A, B) was already ARD(T<sup>I</sup> )-inconsistent).

Since no contradiction was found, by compactness of first-order logic, A′2 ∪ B′2 has a T<sup>I</sup> ∪ EUF-model M (below we treat index and element variables occurring in A, B as free constants and the array variables occurring in A, B as free unary function symbols). M is a two-sorted structure (the sorts are INDEX and ELEM) endowed, for every array variable a occurring in A, B, with a function a<sup>M</sup> : INDEX<sup>M</sup> −→ ELEM<sup>M</sup>. In addition, INDEX<sup>M</sup> is a model of T<sup>I</sup> . We build three ARD(T<sup>I</sup> )-structures 𝒜, ℬ, 𝒞 and two embeddings μ1 : 𝒞 −→ 𝒜, μ2 : 𝒞 −→ ℬ such that 𝒜 |= A′, ℬ |= B′ and such that for every common variable x we have μ1(x<sup>𝒞</sup>) = x<sup>𝒜</sup> and μ2(x<sup>𝒞</sup>) = x<sup>ℬ</sup>. The consistency of A′ ∪ B′ then follows from the amalgamation Theorem 2. The two structures 𝒜, ℬ are obtained by taking the full functional model induced by the restriction of M to the interpretation of A-terms and B-terms (respectively) of sort INDEX, ELEM and then applying Lemma 3; the construction of 𝒞 requires some subtleties, to be detailed in the extended version [21], where the full proof of the theorem is provided.

#### **8 When indexes are just a total order**

Comparing the results from Sections 7 and 6, a striking difference emerges: whereas variable and constant instantiations are sufficient for satisfiability checking, our interpolation algorithm requires full instantiation over all common terms. Such a full instantiation might be quite impractical, especially in index theories like LIA and LRA (it is less annoying in theories like IDL: here all terms are of the kind S<sup>n</sup>(x) or P <sup>n</sup>(x), where x is a variable or 0 and S, P are the successor and the predecessor functions). The problem disappears in simpler theories like the theory of linear orders T O, where all terms are variables (or the constant 0). Still, even in the case of T O, the proof of Theorem 4 does not give a bound for termination of the interpolation algorithm: we know that sooner or later an inconsistency will occur, but we do not know how many times we need to execute the main loop. We now improve the proof of Theorem 4 by supplying the missing bound. In this section, the index theory is fixed to be T O and we abbreviate ARD(T O) as ARD. The full proof of the theorem below is in [21].

<sup>12</sup> On the other hand, the joined pair (A′1 ∪ B′1, A′2 ∪ B′2) is not even 0-instantiated.

**Theorem 5.** *If* A ∧ B *is inconsistent modulo* ARD*, then the above loop terminates in at most* ((m<sup>2</sup> − m)/2) · (n + 1) *steps, where* n *is the number of index variables occurring in* A, B *and* m *is the number of common array variables.*

*Proof.* We sketch a proof of the theorem: the idea is that if after N := ((m<sup>2</sup> − m)/2) · (n + 1) steps no inconsistency occurs, then we can run the algorithm for infinitely many further steps without finding an inconsistency either. Let A<sup>N</sup> = (A<sup>N</sup><sub>1</sub>, A<sup>N</sup><sub>2</sub>) and B<sup>N</sup> = (B<sup>N</sup><sub>1</sub>, B<sup>N</sup><sub>2</sub>) be obtained after N executions of the loop and let M be a T O ∪ EUF-model of A<sup>N</sup><sub>2</sub> ∧ B<sup>N</sup><sub>2</sub>. Fix a pair of distinct common array variables c1, c2 to be handled in Step N + 1; since all pairs of common array variables have been examined in a fair way, A<sup>N</sup><sub>1</sub> and B<sup>N</sup><sub>1</sub> contain the atom diffn+1(c1, c2) = kn+1 (in fact N := ((m<sup>2</sup> − m)/2) · (n + 1) and (m<sup>2</sup> − m)/2 is the number of distinct unordered pairs of common array variables, so the pair (c1, c2) has been examined more than n times). In M, some index constant kl with l ≤ n + 1, if not assigned to 0, is assigned to an element x which is different from the elements assigned to the n index variables occurring in A, B. This allows us to enlarge M to a superstructure which is a model of A<sup>N+1</sup><sub>2</sub> ∧ B<sup>N+1</sup><sub>2</sub> by 'duplicating' x. Continuing in this way, we produce a chain of T O ∪ EUF-models witnessing that we can run infinitely many steps of the algorithm without finding an inconsistency.
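As an illustrative sanity check (not part of the proof), instantiating the bound on Example 2, where the common array variables are c1, c2 and the index variables are i1, i2, i3, gives

$$m = 2,\ n = 3 \quad\Longrightarrow\quad \frac{m^2 - m}{2}\cdot(n+1) = 1 \cdot 4 = 4,$$

so at most four executions of the loop are needed there; the two executions reported in Example 2 are well within this bound.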

#### **9 Conclusions and further work**

We studied an extension of McCarthy's theory of arrays with a maxdiff symbol. This symbol yields a much more expressive theory than the plain diff symbol already considered in the literature [8,37].

We have also considered another strong enrichment, namely the combination with arithmetic theories like IDL, LIA, LRA, ... (all such theories are encompassed by the general notion of an 'index theory'). Such a combination is nontrivial because it is a non-disjoint combination (the ordering relation is in the shared signature) and does not fulfill the T0-compatibility requirements of [17,19,18] needed in order to modularly import satisfiability and interpolation algorithms from the component theories.

The above enrichments come at a substantial cost: although decidability of satisfiability of quantifier-free formulae is not difficult to obtain, quantifier-free interpolation becomes challenging. In this paper, we proved that quantifier-free interpolants do indeed exist: the interpolation algorithm is in fact rather simple, but its justification comes via a complicated détour involving semantic investigations on amalgamation properties.

The interpolation algorithm is based on a hierarchic reduction to general quantifier-free interpolation in the index theory. The reduction requires the introduction of iterated diff terms and a finite number of instantiations of the universal clauses associated with write- and diff-atoms. For the simple case where the index theory is just the theory of total orders, we were able to polynomially bound the depth of the iterated diff terms to be introduced as well as the number of instantiations needed. The main open problem we leave for future work is the determination of analogous bounds for richer index theories.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **Adjoint Reactive GUI Programming**

Christian Uldal Graulund<sup>1</sup>, Dmitrij Szamozvancev<sup>2</sup>, and Neel Krishnaswami<sup>2</sup>

<sup>1</sup> IT University of Copenhagen, 2300 Copenhagen, DK — cgra@itu.dk
<sup>2</sup> University of Cambridge, Cambridge CB3 0FD, UK — nk480@cl.cam.ac.uk, ds709@cl.cam.ac.uk

**Abstract.** Most interaction with a computer is via graphical user interfaces. These are traditionally implemented imperatively, using shared mutable state and callbacks. This is efficient, but is also difficult to reason about and error prone. Functional Reactive Programming (FRP) provides an elegant alternative which allows GUIs to be designed in a declarative fashion. However, most FRP languages are synchronous and continually check for new data. This means that an FRP-style GUI will "wake up" on each program cycle. This is problematic for applications like text editors and browsers, where often nothing happens for extended periods of time, and we want the implementation to sleep until new data arrives. In this paper, we present an asynchronous FRP language for designing GUIs called λ<sup>W</sup>idget. Our language provides a novel semantics for widgets, the building block of GUIs, which offers both a natural Curry– Howard logical interpretation and an efficient implementation strategy.

**Keywords:** Linear Types · FRP · Asynchrony · GUIs

#### **Introduction**

Many programs, like compilers, can be thought of as functions – they take a single input (a source file) and then produce an output (such as a type error message). Other programs, like embedded controllers, video games, and integrated development environments (IDEs), engage in a dialogue with their environment: they receive an input, produce an output, and then wait for a new input that depends on the prior input, and produce a new output which is in turn potentially based on the whole history of prior inputs.

The usual techniques for programming interactive applications are often confusing, since different parts of the program are not written to interact via structured control flow (e.g., by passing arguments to and returning values from functions). Instead, they communicate indirectly, via state-manipulating callbacks which are implicitly invoked by an event loop. This makes program reasoning very challenging, since each of aliased mutable state, higher-order functions, and concurrency is tricky on its own, and interactive programs rely upon their *combination*.

This challenge has led to a great deal of work on better abstractions for programming reactive systems. Two of the main lines of work on this problem are *synchronous dataflow* and *functional reactive programming*. The synchronous

c The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 289–309, 2021. https://doi.org/10.1007/978-3-030-71995-1 15

dataflow languages, like Esterel [5], Lustre [9], and Lucid Synchrone [28], feature a programming model inspired by Kahn networks. Programs are networks of stream-processing nodes which communicate with each other, each node consuming and producing a fixed number of primitive values at each clock tick. The first-order nature of these languages makes them strongly analysable, which lets them offer powerful guarantees on space and time usage. This means they see substantial use in embedded and safety-critical contexts.

Functional reactive programming, introduced by Elliott and Hudak [13], also uses time-indexed values, dubbed signals, rather than mutable state as its basic primitive. However, FRP differs from synchronous dataflow by sacrificing static analysability in favour of a much richer programming model. Signals are true first-class values, and can be used freely, including in higher-order functions and signal-valued signals. This permits writing programs with a dynamically-varying dataflow network, which simplifies writing programs (such as GUIs) in which the available signals can change as the program executes. Over the past decade, a long line of work has refined FRP via the Curry–Howard correspondence [21,18,17,19,20,10,1]. This approach views functional reactive programs as the programming counterpart for proofs of formulas in linear temporal logic [27], and has enabled the design of calculi which can rule out spacetime leaks [20] or can enforce temporal safety and liveness properties [10].

However, both synchronous dataflow and FRP (in both original and modal flavours) have a *synchronous* (or "pull") model of time – time passes in ticks, and the program wakes up on every tick to do a little bit more computation. This is suitable for applications in which something new happens at every time step (e.g., video games), but many GUI programs like text editors and spreadsheets spend most of their time doing nothing. That is, even at each event, most of the program will continue doing nothing, and we only want to wake up a component when an event directly relevant to it occurs. This is important both from a performance point of view, as well as for saving energy (and extending battery life). Because of this need, most GUI programs continue to be written in the traditional callbacks-on-mutable-state style.

In this paper, we give a reactive programming language whose type system both has a very straightforward logical reading, and which can give natural types to stateful widgets and the event-based programming model they encourage. We also derive a denotational semantics of the language, by first working out a semantics of widgets in terms of the operations that can be performed upon them and the behaviour they should exhibit. Then, we find the categorical setting in which the widget semantics should live, and by studying the structure this setup has, we are able to interpret all of the other types of the programming language.

*Contributions* The contributions of this paper are:

**–** We give a descriptive semantics for widgets in GUI programming, and show that this semantics correctly models a variety of expected behaviours. For example, our semantics shows that a widget which is periodically re-set to the colour red is different from a widget that was only persistently set to the colour red at the first timestep. Our semantic model can show that as long as neither one is updated, they look the same, but that they differ if they are ever set to blue – the first will return to red at reset time, and the second will remain blue.


#### **The Language**

We now present λ<sup>W</sup>idget through the API of the Widget type. This API mirrors how one would work with a GUI at the browser level. An important feature of a well-designed GUI is that it should not do anything when not in use. In particular, it should not check for new inputs in each program cycle (*pull*-based reactive programming), but rather sleep until new data arrives (*push*-based reactive programming). Many FRP languages are *synchronous* languages and have some internal notion of a timestep. These languages are mostly pull-based, whereas more traditional imperative reactive languages are push-based. The former have clear semantics and are easy to reason about, while the latter have efficient implementations. In λ<sup>W</sup>idget we would like to combine these aspects and get a language that is both easy to reason about and has an efficient implementation.

In general, we think of a widget as a *state through time*, i.e., at each timestep, the widget is in some state which is presented to the user. The widget is modified by *commands*, which can update the state. To program with widgets, the programmer applies commands at various times.

The proper type system for a language of widgets should thus be a system with both state and time. If we consider what a *logic* for widgets should be, there are two obvious choices. A logic for state is linear logic [14], and a logic for time is linear temporal logic [27]. The combination of these two is the correct setting for a language of widgets, and, going through Curry–Howard, the corresponding type theory is a linear, linear temporal type theory.

*Widget API* To work with widgets, we define an API which mirrors how one would work with a browser-level GUI:

```
newWidget  : I ⊸ ∃ (i : Id), Widget i
dropWidget : ∀ (i : Id), Widget i ⊸ I
setColor   : ∀ (i : Id), F Color ⊗ Widget i ⊸ Widget i
onClick    : ∀ (i : Id), Widget i ⊸ Widget i ⊗ ◇I
onKeypress : ∀ (i : Id), Widget i ⊸ Widget i ⊗ ◇(F Char)
out        : ◇A ⊸ ∃ (n : Time), A @ n
into       : ∃ (n : Time), A @ n ⊸ ◇A
split      : ∀ (i : Id) (t : Time), Widget i ⊸ Prefix i t ⊗ (Widget i) @ t
join       : ∀ (i : Id) (t : Time), Prefix i t ⊗ (Widget i) @ t ⊸ Widget i
```
The first two commands create and delete widgets, respectively. They should be understood as *state passing*. We read the type of newWidget as "consuming no state, produce a new identifier index and a widget with that identifier index". The identifier indices are used to ensure the correct behavior when using the split and join commands explained below. The existential quantification describes the *non-deterministic* creation of an identifier index. The use of non-determinism is crucial in our language and will be explained in further detail in section 1. Since λ<sup>W</sup>idget has a linear type system, we need an explicit construction to delete state. For widgets, this is dropWidget. The type is read as "for any identifier index, consume a widget with that identifier index and produce nothing".

The first command that modifies the state of a widget is setColor. Here we see the adjoint nature of the calculus with F Color. A color is itself *not* a linear thing, and as such, to use it in the linear setting, we apply F, which moves it from the non-linear (Cartesian) fragment into the linear fragment. The second new thing is the linear product ⊗. This differs from the regular non-linear product in that we do not have projection maps. Again, because of the linearity of our language, we cannot just discard state. We can now read the type of setColor as "Given a color and an identified widget, consume both and produce a new widget". The produced widget is the same as the consumed widget, but with the color attribute updated.

The next two commands, onClick and onKeypress, are roughly similar. Both register a handler on the widget, for a mouse click and a key press, respectively. Here we see the first use of the ◇ modality, which represents an *event*. The type ◇A represents that *at some point in the future* we will receive something of type A. Importantly, because of the asynchronous nature of λ<sup>W</sup>idget, we do not know *when* this happens. We can then read the type of onClick as "Consuming an identified widget, produce an updated widget together with a mouse click event". The same holds for onKeypress, except that a key press event is produced.

The two commands out and into allow us to work with events in a more precise way. Given an event, we can use out to "unfold" it into an existential. The @ connective describes a type that is only available at a certain timestep, i.e., A @ n means "at the timestep n, a term of type A will be available". The into command is the reverse of out and turns an existential and an @ into an event.

Note that besides the above ways of constructing events, we can also turn any value into an event using the evt construction, which is part of the core calculus. Given some element a : A, we get evt a : ◇A, which represents the event that returns immediately.

So far, we have only applied commands to a widget in the current timestep, but to program appropriately with widgets, we should be able to react to events and apply commands "in the future". This is exactly what the split and join commands allow us to do. The type of split is read as "Given any time step and any identified widget, split the widget into all the states *before* that time and the widget *at* that time". We denote the collection of states before a given time a *prefix* and give it the type Prefix. Given the state of the widget at a given timestep, we can now apply commands *at that timestep*. Note that both the prefix and the widget are indexed by the same identifier index. This is to ensure that when we use join, we combine the correct prefix and future.

*Widget Programming* To see the API in action, we now proceed with several examples of widget programming. For each example, we will add a comment on each line with the type of variables, and then explain the example in text afterwards.

One of the simplest things we can do with a widget is to perform some action when the widget is clicked. In the following example, we register a handler for mouse clicks, and then we use the click event to change the color of the widget to red at the time of the click. To do this, we use the out map to get the time of the event, then we split the widget and apply setColor at that point in the future.

```
1  turnRedOnClick : ∀ (i : Id), Widget i ⊸ Widget i
2  turnRedOnClick i w0 =
3    let (w1, c0) = onClick i w0 in      -- w1 : Widget i, c0 : ◇I
4    let unpack (x, c1) = out c0 in      -- x : Time, c1 : I @ x
5    let c2 @ x = c1 in                  -- c2 : I at x
6    let ⋆ @ x = c2 in
7    let (p, w2) = split i x w1 in       -- p : Prefix i x, w2 : Widget i @ x
8    let w3 @ x = w2 in                  -- w3 : Widget i at time x
9    let w4 =                            -- w4 : Widget i @ x
10     (setColor (F Red) w3) @ x in
11   join i x (p, w4)
```
To see why this type checks, we go through the example line by line. In line 3, we register a handler for a mouse click on the widget. In line 4, we turn the click event into an existential. In line 5, we get c2, which is a binding that is only available at the timestep x. Since we only need the *time* of the click, we discharge the click itself in line 6. In lines 7 and 8, we split the widget using the timestep x and bind w3 to the state of the widget at that timestep. In lines 9-10, we change the color of the widget to red at x, and in line 11 we recompose the widget.

In general, we will allow pattern matching in eliminations, and since widget identity indices can always be inferred, we will omit them. In this style, the above example becomes:

```
1  turnRedOnClick : ∀ (i : Id), Widget i ⊸ Widget i
2  turnRedOnClick w0 =
3    let (w1, c0) = onClick w0 in         -- w1 : Widget i, c0 : ◇I
4    let unpack (x, ⋆ @ x) = out c0 in    -- x : Time
5    let (p, w2 @ x) = split x w1 in      -- p : Prefix i x, w2 : Widget i at time x
6    join x (p, (setColor (F Red) w2) @ x)
```
We will use the same sugared style throughout the rest of the examples.

The above example turns a widget red exactly at the time of the mouse click, but will not do anything with successive clicks. To also handle further mouse clicks, we must register an event handler *recursively*. This is a simple modification of the previous code:

```
1  keepTurningRed : ∀ (i : Id), Widget i ⊸ Widget i
2  keepTurningRed w0 =
3    let (w1, c0) = onClick w0 in         -- w1 : Widget i, c0 : ◇I
4    let unpack (x, ⋆ @ x) = out c0 in    -- x : Time
5    let (p, w2 @ x) = split x w1 in      -- p : Prefix i x, w2 : Widget i at time x
6    join x (p, (setColor (F Red) (keepTurningRed w2)) @ x)
```
By calling itself recursively, this function will make sure a widget will always turn red on a mouse click.

To understand the difference between the two examples above, consider the code *turnBlueOnClick* (*keepTurningRed* w), where w is some widget. On the first click, the widget will turn blue, on the second click it will turn red, and on any subsequent click it will keep turning red, i.e., stay red unless further modified.

When working with widgets, we will often register multiple handlers on a single widget. For example, a widget should have one behavior for a click and another behavior for a key press. To choose between two events, we use the select construction. This construction is central to our language and how to think about a push-based reactive language.

Given two events t1 : ◇A and t2 : ◇B, there are three possible behaviors: either t1 returns first and we wait for t2, or t2 returns first and we wait for t1, or they return at the same time. In general, we want to select between n events, but if we need to handle all possible cases, this will give 2<sup>n</sup> cases, so to keep the syntax linear in size, we omit the last case. In case the events *actually* return at the same time, we make a non-deterministic choice between them. The syntax for select is

> select (t1 as x → t′1 | t2 as y → t′2)

where x : A, y : B, and t′1 and t′2 both have type ◇C; t′1 is typed using x : A and t2 : ◇B, while t′2 is typed using y : B and t1 : ◇A. The second important thing to understand when working with select is that, since we are working with events, we do not actually know at which timestep the events will trigger, and hence we do not know what the (linear) context contains. Thus, when using select, we will *only* know either x : A, t2 : ◇B or t1 : ◇A, y : B. We can think of the select rule as a *case-expression* that must respect time.

In the following example, we register two handlers, one for clicks and one for key presses, and change the color of the widget based on which returns first. We will only annotate the new parts.

```
1  widgetSelect : ∀ (i : Id), Widget i ⊸ Widget i
2  widgetSelect w0 =
3    let (w1, c) = onClick w0 in                    -- c : ◇I
4    let (w2, k) = onKeypress w1 in                 -- k : ◇(F Char)
5    let col =                                      -- col : ◇(F Color)
6      select
7        ( c as x → let ⋆ = x in                    -- x : I, k : ◇(F Char)
8                   let unpack (t, ⋆ @ t)
9                       = out (mapE (fun F c → ⋆) k) in
10                  evt (F Red)
11       | k as y → let F k' = y in                 -- y : F Char, c : ◇I
12                  let unpack (t, ⋆ @ t) = out c in
13                  evt (F Blue)) in
14   let unpack (x, col' @ x) = out col in          -- col' : F Color at time x
15   let (p, w3 @ x) = split x w2 in
16   join x (p, (setColor col' w3) @ x)
```
In lines 3 and 4, we register the two handlers. In lines 5-13, we use the select construction. In the first case, the click happens first and we return the color red. In the second case, the key press happens first and we return the color blue. In both cases, because of the linear nature of the language, we need to discharge the unit and the char, respectively, as well as the event that does not return first. In line 14, we turn the color event into an existential. In line 15, we use the timestep of the color event to split the widget, and in line 16, we change the color of the widget at that time and recompose it.

To see how λ<sup>W</sup>idget differs from more traditional synchronous FRP languages, we will examine how to encode a kind of stream. Since our language is *asynchronous*, the stream type must be encoded as

$$\mathsf{Str}\,A := \nu \alpha.\, \Diamond (A \otimes \alpha).$$

This asynchronous stream will *at some point in the future* give a head and a tail. We do not know when the first element of the stream will arrive, and after each element of the stream is produced, we will wait an indeterminate amount of time for the next element. The reason why the stream type in λ<sup>W</sup>idget must be like this is essentially that we want a *push-based* language, i.e., we do not want to wake up and check for new data in each program cycle. Instead, the program should sleep until new data arrives.
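For comparison, in synchronous modal FRP calculi such as [20,1] a stream delivers its head now and its tail one tick later; the synchronous type below is recalled from that line of work and is not part of λ<sup>W</sup>idget:

$$\mathsf{Str}^{\mathrm{sync}}\,A := \nu\alpha.\, A \times {\triangleright}\alpha \qquad\text{versus}\qquad \mathsf{Str}\,A := \nu\alpha.\, \Diamond(A \otimes \alpha).$$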

To show the difference between the asynchronous stream and the more traditional synchronous stream, we will look at some examples. With a traditional stream, a standard operation is zipping two streams: that is, given Str A and Str B, we can produce Str (A × B), which should be the element-wise pairing of the two streams. It should be clear that this is not possible for our asynchronous streams. Given two streams, we can wait until the first stream produces an element, but the second stream may only produce an element after a long period of time. Hence, we would need to buffer the first element, which is not supported in general. Remember, when using select, we cannot use any already defined linear variables, since we do not know if they will be available in the future.

Rather than zipping streams, we can instead do a kind of *interleaving*, as shown below. We use fold and unfold to denote the folding and unfolding of the fixpoint.

```
1  interleave : Str A ⊸ Str B ⊸ Str (A ⊕ B)
2  interleave xs ys = fold (
3    select
4      ( unfold xs as xs' →
5          let (x, xs'') = xs' in    -- xs' : A ⊗ Str A, x : A, xs'' : Str A
6          evt (inl x, interleave xs'' ys)
7      | unfold ys as ys' →
8          let (y, ys'') = ys' in    -- ys' : B ⊗ Str B, y : B, ys'' : Str B
9          evt (inr y, interleave xs ys'')))
```
Here, we use select to choose between which stream returns first, and then we let that element be the first element of the new stream.

On the other hand, some of the traditional FRP functions on streams can be translated. For instance, we can map a function over a stream, given that *it is available at each step in time*:

```
map : F (G (A ⊸ B)) ⊸ Str A ⊸ Str B
map f0 xs =
  let F f1 = f0 in                                    -- f1 : G (A ⊸ B)
  let unpack (y, (x, xs') @ y) = out (unfold xs) in   -- y : Time, x : A, xs' : Str A at time y
  fold (evt ((runG f1) x, map f0 xs'))
```

The type F (G (A ⊸ B)) is read as a linear function with no free variables that can be used in a non-linear fashion, i.e., duplicated. This restriction to such "globally available functions" is reminiscent of the "box" modality in Bahr et al. [1] and Krishnaswami [20], and the F and G constructions can be understood as decomposing the box modality into two separate steps. This relationship will be made precise in the logical interpretation of λ<sup>W</sup>idget in section 1.

As a final example, we will show how to dynamically update the GUI, i.e., how to add new widgets on the fly. Before we can give the example, we need to extend our widget API, to allow composition of widgets. To that end, we add the vAttach command to our API.

$$\mathsf{vAttach} : \forall (i, j : \mathsf{Id}),\ \mathsf{Widget}\ i \multimap \mathsf{Widget}\ j \multimap \mathsf{Widget}\ i.$$

This command should be understood as an abstract version of the div tag in HTML. In the following example, we think of the widget as a simple button that, when clicked, will create a new button. When *any* of the buttons gets clicked, a new button gets attached.

```
1  buttonStack : ∀ i, Widget i ⊸ Widget i
2  buttonStack w0 =
3    let (w1, c) = onClick w0 in
4    let unpack (x, ⋆ @ x) = out c in
5    let (p, w2 @ x) = split x w1 in
6    let w3 = (let (y, w) = newWidget ⋆ in
7              vAttach w2 (buttonStack w)) @ x in
8    join x (p, w3)
```
The important steps here are in lines 6 and 7. Here the new button is attached at the time of the mouse click, and buttonStack is called recursively on the newly created button.

#### **Formal Calculus**

This section gives the rules, meta-theory and logical interpretation of λ<sup>W</sup>idget. Briefly, the language is a mixed linear/non-linear adjoint calculus in the style of Benton–Wadler [4,3]. The non-linear fragment, also called Cartesian in the following, is a minimal simply typed lambda calculus, whereas the linear fragment contains several non-standard judgments used for widget programming.

*Contexts and Typing Judgments* We have three typing judgments: one for indices, one for Cartesian (non-linear) terms, and one for linear terms. These are distinguished by a subscript on the turnstile: i for indices, c for Cartesian terms and l for linear terms. They depend on different contexts. The index judgment depends only on an index context, whereas the Cartesian and linear judgments depend on both an index context and a linear and/or a Cartesian context. The rules for context formation are given in Figure 1. These are mostly standard, except for the dependence on a previously defined context and the fact that the linear context contains variables of the form a :<sub>τ</sub> A, i.e., temporal variables. The judgment a :<sub>τ</sub> A is read as "a has the type A at the timestep τ". In the linear setting we will write a : A instead of a :<sub>0</sub> A, i.e., a judgment in the current timestep.


**Fig. 1.** Context Formation
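In summary, the three judgment forms described in this section have the following shapes (a recap of the surrounding text rather than a reproduction of the figure):

$$\Theta \vdash\_{i} \tau : \sigma \qquad\qquad \Theta; \Gamma \vdash\_{c} e : X \qquad\qquad \Theta; \Gamma; \Delta \vdash\_{l} t :\_{\tau} A$$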

The index judgment describes how to introduce indices. The typing rules are given in Figure 2. The judgment Θ ⊢<sub>i</sub> τ : σ involves a single context, Θ, for index variables. There are only two sorts of indices, identifiers and timesteps.

#### **Index Judgments:**

$$\frac{\tau \in \mathsf{Time}}{\Theta \vdash\_{i} \tau : \mathsf{Time}}\ (\textsc{Time}) \qquad \frac{\iota \in \mathsf{Id}}{\Theta \vdash\_{i} \iota : \mathsf{Id}}\ (\textsc{Id}) \qquad \frac{i : \sigma \in \Theta}{\Theta \vdash\_{i} i : \sigma}\ (\textsc{Var})$$

**Fig. 2.** Index Typing rules

The Cartesian judgment describes the Cartesian, or non-linear, fragment. This is a minimal simply typed lambda calculus with the addition of the G type, used for moving between the linear and Cartesian fragments, and explained further below. The judgment Θ; Γ ⊢<sub>c</sub> t : A has two contexts: Θ for indices and Γ for Cartesian variables.

The linear fragment makes up most of the language, and a selection of typing rules is given in Figure 3. The judgment is made w.r.t. three contexts: Θ for index variables, Γ for Cartesian variables, and Δ for linear variables. Many of the rules are standard for a linear calculus, except for the presence of the additional contexts. We will not describe the standard rules any further.

The first non-standard rule is the one for ◇. The introduction and elimination rules follow from the fact that ◇ is a non-strong monad. More interesting is the select rule. Here we see the formal rule corresponding to the informal explanation in section 1. The important point is that we cannot use any previously defined linear variable when typing t′1 and t′2, since we do not actually know *when* the typing happens. Note that we can see the select rule as a binary version of the ◇ let-binding. This could be extended to an n-ary version, but we do not do this in our core calculus. The rules for A @ τ show how to move between the judgments t : A @ τ and t :<sub>τ</sub> A, that is, between knowing in the current timestep that t will have the type A at time τ and knowing at time τ that t has type A. The (F-I), (F-E), (G-I) and (G-E) rules show the adjoint structure of the language. The (G-I) rule takes a closed linear term of type A and gives it the Cartesian type G A. Note that, because it has no free linear variables, it is safe to duplicate. The (G-E) rule lets us get an A without needing any linear resources. Conversely, the (F-I) rule embeds an intuitionistic term into the linear fragment, and the (F-E) rule binds an intuitionistic variable to let us freely use the value. The (Delay) rule shows what happens when we actually *know* the timestep. The important part is Δ′ = Δ ↓<sub>τ</sub>, which means two things: first, all the variables in Δ are of the form a :<sub>τ</sub> A, i.e., judgments at time τ; and second, we shift Δ into the future so that all the variables of Δ′ are of the form a : A. The way to understand this is: if all the variables in Δ are typed at time τ and the conclusion is at time τ, it is enough to "move to" time τ and then type w.r.t. that timestep. Finally, we have (I<sub>τ</sub>-E) and (⊗<sub>τ</sub>-E). These allow us to work with the linear unit and products at time τ. They are added explicitly since they cannot be derived from the other rules, and are needed for typing certain kinds of programs.

*Unfolding Events to Exists* The type system as given above contains both ◇A and A @ k as two distinct ways to handle time. The former means that

$$\frac{\Theta \vdash\_{i} \tau : \mathsf{Time} \quad \Theta; \Gamma; \Delta \vdash\_{l} t :\_{\tau} A}{\Theta; \Gamma; \Delta \vdash\_{l} t \, @ \, \tau : A \, @ \, \tau}\ (@\text{-I}) \qquad \frac{\Theta \vdash\_{i} \tau : \mathsf{Time} \quad \Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} : A \, @ \, \tau \quad \Theta; \Gamma; \Delta\_{2}, a :\_{\tau} A \vdash\_{l} t\_{2} : B}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{let}\ a \, @ \, \tau = t\_{1}\ \mathsf{in}\ t\_{2} : B}\ (@\text{-E})$$

$$\frac{\Theta; \Gamma \vdash\_{c} e : \mathsf{G}\, A}{\Theta; \Gamma; \cdot \vdash\_{l} \mathsf{runG}\ e : A}\ (\mathsf{G}\text{-E}) \qquad \frac{\Theta; \Gamma \vdash\_{c} e : X}{\Theta; \Gamma; \cdot \vdash\_{l} \mathsf{F}\, e : \mathsf{F}\, X}\ (\mathsf{F}\text{-I}) \qquad \frac{\Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} : \mathsf{F}\, X \quad \Theta; \Gamma, x : X; \Delta\_{2} \vdash\_{l} t\_{2} : B}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{let}\ \mathsf{F}\, x = t\_{1}\ \mathsf{in}\ t\_{2} : B}\ (\mathsf{F}\text{-E})$$

$$\frac{\Theta, i : \sigma; \Gamma; \Delta \vdash\_{l} t : A}{\Theta; \Gamma; \Delta \vdash\_{l} \Lambda(i : \sigma).\, t : \forall(i : \sigma).A}\ (\forall\text{-I}) \qquad \frac{\Theta \vdash\_{i} s : \sigma \quad \Theta; \Gamma; \Delta \vdash\_{l} t : \forall(i : \sigma).A}{\Theta; \Gamma; \Delta \vdash\_{l} t\, s : \{s/i\}A}\ (\forall\text{-E})$$

$$\frac{\Theta \vdash\_{i} s : \sigma \quad \Theta; \Gamma; \Delta \vdash\_{l} t : \{s/i\}A}{\Theta; \Gamma; \Delta \vdash\_{l} \{s, t\} : \exists(i : \sigma).A}\ (\exists\text{-I}) \qquad \frac{\Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} : \exists(i : \sigma).A \quad \Theta, s : \sigma; \Gamma; \Delta\_{2}, a : \{s/i\}A \vdash\_{l} t\_{2} : B}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{let\ unpack}\ \{s, a\} = t\_{1}\ \mathsf{in}\ t\_{2} : B}\ (\exists\text{-E})$$

$$\frac{\Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} : \Diamond A \quad \Theta; \Gamma; \Delta\_{2} \vdash\_{l} t\_{2} : \Diamond B \quad \Theta; \Gamma; a : A, t\_{2} : \Diamond B \vdash\_{l} t'\_{1} : \Diamond C \quad \Theta; \Gamma; b : B, t\_{1} : \Diamond A \vdash\_{l} t'\_{2} : \Diamond C}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{select}\ (t\_{1}\ \mathsf{as}\ a \to t'\_{1} \mid t\_{2}\ \mathsf{as}\ b \to t'\_{2}) : \Diamond C}\ (\mathrm{select})$$

$$\frac{\Theta \vdash\_{i} \tau : \mathsf{Time} \quad \Delta' = \Delta \downarrow\_{\tau} \quad \Theta; \Gamma; \Delta' \vdash\_{l} t : A}{\Theta; \Gamma; \Delta \vdash\_{l} t :\_{\tau} A}\ (\mathrm{delay})$$

$$\frac{\Theta \vdash\_{i} \tau : \mathsf{Time} \quad \Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} :\_{\tau} I \quad \Theta; \Gamma; \Delta\_{2} \vdash\_{l} t\_{2} : B}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{let}\ \star \, @ \, \tau = t\_{1}\ \mathsf{in}\ t\_{2} : B}\ (I\_{\tau}\text{-E}) \qquad \frac{\Theta \vdash\_{i} \tau : \mathsf{Time} \quad \Theta; \Gamma; \Delta\_{1} \vdash\_{l} t\_{1} :\_{\tau} A \otimes B \quad \Theta; \Gamma; \Delta\_{2}, a :\_{\tau} A, b :\_{\tau} B \vdash\_{l} t\_{2} : C}{\Theta; \Gamma; \Delta\_{1}, \Delta\_{2} \vdash\_{l} \mathsf{let}\ (a, b) \, @ \, \tau = t\_{1}\ \mathsf{in}\ t\_{2} : C}\ (\otimes\_{\tau}\text{-E})$$

**Fig. 3.** Selected Linear Typing rules

something of type A will arrive at *some* point in the future, whereas the latter means an A arrives at a *specific* point in the future. The strength of ◇A is that it gives easy and concise typing rules, whereas the strength of A @ k is that it allows for a more precise usage of time. To connect the two, we add the linear isomorphism ◇A ≅ ∃k. A @ k to our language, witnessed by out and into, as part of the widget API. This isomorphism is true semantically, but it cannot be derived in the type system. In particular, the isomorphism allows the select rule to be stated with ◇, while still allowing the use of timesteps when working with the resulting event. If we were to give an equivalent definition using timesteps, one would need some sort of *constraint system* for deciding which event happens first. Avoiding such constraints also allows for a simpler implementation, as everything in our type system can be inferred.
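To make the role of this isomorphism concrete, here is a small illustrative sketch, not the paper's implementation: the representation with a runtime `Int` delay is entirely ours, while `out` and `into` are the names the text gives to the two witnesses. In this naive reading, ◇A is literally the pairing of an arrival time with a value arriving at that time, matching Theorem 4 below.

```haskell
-- Illustrative only: A @ k is modelled as a payload tagged with a runtime
-- arrival time k, and an event packages *some* such time with its payload.
data At a = At { arrival :: Int, payload :: a }   -- plays the role of A @ k

newtype Ev a = Ev (At a)                          -- plays the role of ◇A

-- The two directions of the isomorphism ◇A ≅ ∃k. A @ k.
into :: At a -> Ev a
into = Ev

out :: Ev a -> At a
out (Ev x) = x
```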

*Meta-theory of Substitution* The meta-theory of λWidget is given in the form of a series of substitution lemmas. Since we have three different contexts, we end up with six different substitutions into terms. The Cartesian-into-Cartesian, Cartesian-into-linear and linear-into-linear substitutions are the usual notions of mutually recursive substitution. More interesting is the substitution of indices into Cartesian and linear terms and types. We prove the following lemma, showing that typing is preserved under index substitution:

#### **Lemma 1 (Preservation of Typing under Index Substitution).**

$$\frac{\zeta : \Theta' \to \Theta \qquad \Theta;\Gamma \vdash_c e : X}{\Theta';\zeta(\Gamma) \vdash_c \zeta(e) : \zeta(X)}
\qquad\qquad
\frac{\zeta : \Theta' \to \Theta \qquad \Theta;\Gamma;\Delta \vdash_l t :_\tau A}{\Theta';\zeta(\Gamma);\zeta(\Delta) \vdash_l \zeta(t) :_\tau \zeta(A)}$$

Both of these (and all other cases for substitution) are proved by a lengthy but standard induction over the typing tree. See the technical appendix for full proofs of all six substitution lemmas.

*Logical Interpretation* Our language has a straightforward logical interpretation. The logic corresponding to the Cartesian fragment is a propositional intuitionistic logic, following the usual Curry–Howard interpretation. The logic corresponding to the substructural part of the language is a linear, linear temporal logic. The single-use condition on variables means that the syntax and typing rules correspond to the rules of intuitionistic linear logic (i.e., the first occurrence of linear in "linear, linear temporal"). However, we do not have a comonadic exponential modality !A as a primitive. Instead, we follow the Benton–Wadler approach [4,3] and decompose the exponential into the composition of a pair of adjoint functors mediating between the Cartesian and linear logic.

In addition to the Benton–Wadler rules, we have a temporal modality ◇A, which corresponds to the eventually modality of linear temporal logic (i.e., the second occurrence of "linear" in "linear, linear temporal logic"). This connective is usually written F A in temporal logic, but that collides with the F modality of the Benton–Wadler calculus. Therefore we write it as ◇A to reflect its nature as a possibility modality (or monad). In our calculus, the axioms of S4.3 are derivable:

$$\begin{aligned} (T) &: A \multimap \Diamond A \\ (4) &: \Diamond \Diamond A \multimap \Diamond A \\ (3) &: \Diamond (A \otimes B) \multimap \Diamond ((\Diamond A \otimes B) \oplus \Diamond (A \otimes \Diamond B) \oplus \Diamond (A \otimes B)) \end{aligned}$$

Since the ambient logic is linear, intuitionistic implication X → Y is replaced with the linear implication A ⊸ B, and intuitionistic conjunction X ∧ Y is replaced with the linear tensor product A ⊗ B. It is easy to see that the first two axioms correspond to the monadic structure of ◇, and the .3 axiom corresponds to the select rule (with our syntax for select corresponding to immediately waiting for and then pattern-matching on the sum type). In the literature, the .3 axiom is often written in terms of the box modality □A [8], but we present it here in a (classically) equivalent formulation mentioning the eventually modality ◇A. We do not need an explicit box modality □A, since the decomposition of the exponential F(G A) from the linear-non-linear calculus serves that role.

In our system, *we do not offer* the next-step operator ▷A. Since we model asynchronous programs, we do not let programmers write programs which wake up in a specified amount of time. We only offer an iterated version of this connective, A @ n, which can be interpreted as ▷ⁿA, and our term syntax has no numeric constants which can be used to demand a specific delay.

Finally, the universal and existential quantifiers (in both the intuitionistic and linear fragments) follow the usual quantifier rules of first-order logic.

#### **Semantics**

In this section we give a denotational model for λWidget. It is a linear-non-linear (LNL) hyperdoctrine [24,16] with the non-linear part being Set and the linear part being the category of internal relations over a suitable "reactive" category. The hyperdoctrine structure is used to interpret the quantification over indices. This model is almost entirely standard: the most interesting parts are the reactive base category and the interpretation of widgets. It is well known that any symmetric monoidal closed category (SMCC) models multiplicative intuitionistic linear logic (MILL), and it is similarly well known that the category of relations over Set can be given the structure of an SMCC by using the Cartesian product as both the monoidal product and the monoidal exponential. This construction lifts directly to any category of internal relations over a category that is suitably "Set-like", i.e., a topos. Our base category is a simple presheaf category, and hence we use this construction to model the linear fragment of λWidget.

*The Base Reactive Category* The base reactive category is where the notion of time arises, and it is this notion that will be lifted all the way up to the LNL hyperdoctrine. The simplest model of "time" is Set^ℕ, which can be understood as "sets through time" [23]. This can indeed be used as a model for a reactive setting, but for our purposes it is too simple; further, depending on which ordering is considered for ℕ, it may have undesirable properties for the reactive setting. Instead, we use the only slightly more complicated Set^(ℕ+1), henceforth denoted R, where the ordering on ℕ + 1 is the discrete ordering on ℕ and 1 is related to everything else. Adding this "point at infinity" allows global reasoning about objects, an intuition that is further supported by the definition of the subobject classifier below. Further, this model is known to be able to differentiate between least and greatest fixpoints [15], and even though we do not use this for λWidget, we consider it a useful property for further work (see section 1). Objects in R can be visualized as a family of sets A_0, A_1, A_2, . . . at the individual timesteps, together with a set A_∞ at the point at infinity.

We can think of A_∞ as the global view of the object and A_n as the local view of the object at each timestep. Morphisms are natural transformations between such diagrams, and the naturality condition means that a map from A_∞ to B_∞ must come with coherent maps A_n → B_n at each timestep.

In R we define two endofunctors, which can be seen as describing the passage of time:

**Definition 1.** *We define the* later *and* previous *endofunctors on* R*, denoted* ▷ *and* ◁*, respectively:*

$$(\rhd A)_n := \begin{cases} 1 & n = 0 \\ A_{n'} & n = n' + 1 \\ A_{\infty} & n = \infty \end{cases} \qquad\qquad (\lhd A)_n := \begin{cases} A_{n+1} & n \neq \infty \\ A_{\infty} & n = \infty \end{cases}$$

Note that when we apply the later functor, the global view does not change, but the local views are shifted forward in time.

**Theorem 1.** *The later and previous endofunctors form an adjunction.*
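As a quick sanity check on Definition 1 (our unwinding of the definition, not a full proof of Theorem 1), applying the previous functor after the later functor returns the original object, both at finite stages and at the point at infinity:

$$(\lhd(\rhd A))_n = (\rhd A)_{n+1} = A_n \quad (n \in \mathbb{N}), \qquad\qquad (\lhd(\rhd A))_\infty = (\rhd A)_\infty = A_\infty,$$

whereas $(\rhd(\lhd A))_0 = 1$, so the two composites differ only at stage 0.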

**Definition 2.** *The sub-object classifier, denoted* Ω*, in* R *is the object*

$$\Omega_{\infty} = \mathcal{P}(\mathbb{N}) + 1 \qquad\qquad \Omega_n = \{0, 1\}$$

For each n ∈ ℕ, Ω_n denotes whether a given proposition is true at the n-th timestep. Ω_∞ gives the "global truth" of a given proposition. The left injection is some subset of ℕ that denotes at which points in time something is true. The right injection denotes that something is true "at the limit", and in particular, also at all timesteps. Note that a proposition can be true at all timesteps but not at the limit. This extra point at infinity is precisely what allows us to differentiate between least and greatest fixpoints.

*The Category of Internal Relations* To interpret the linear fragment of the language, we use the category of internal relations on R. Given two objects A and B in R, an *internal relation* is a sub-object of the product A × B. This can equivalently be understood as a map A × B → Ω. The category of internal relations is the category whose objects are the objects of R and whose morphisms A → B are internal relations A × B → Ω in R. We denote the category of internal relations by Rel_R.

**Theorem 2.** *Using* A ⊗ B = A × B *and* A ⊸ B = A × B *as monoidal product and exponential, respectively,* Rel_R *is a symmetric monoidal closed category.*

**Theorem 3.** *There is an adjunction in* Rel_R *where* ◁ *and* ▷ *are the liftings of the previous and later functors from* R *to* Rel_R*.*

**Definition 3.** *We define the* iterated later modality *or the "at" connective as a successive application of the later modality.*

$$\rhd^0 A = A \qquad\qquad \rhd^{k+1} A = \rhd(\rhd^k A)$$

*and we will alternatively write* A @ k *to mean* ▷^k A*.*

**Definition 4.** *We define the* event *functor on* Rel<sup>R</sup> *as an iterated later.*

$$\begin{aligned} \lozenge &: \mathsf{Rel}_{\mathcal{R}} \to \mathsf{Rel}_{\mathcal{R}} \\ (\lozenge A)_{\infty} &= A_{\infty} \\ (\lozenge A)_{n} &= \Sigma(k : \mathbb{N}).\,(\rhd^k A)_{n} \end{aligned}$$

The event functor additionally carries a monadic structure (see [29] and the technical appendix).

**Theorem 4.** *We have the isomorphism* ◇A ≅ Σ(n : ℕ). A @ n *for any* A*.*

**Theorem 5.** *We have the following adjunctions between* Set*,* R *and* RelR*:*

*where* Δ *is the constant functor,* lim *is the limit functor,* I *is the inclusion functor and* P *is the image functor. This induces an adjunction between* Set *and* RelR*.*

*The Widget Object* One of the most important objects in Rel_R is the *widget* object. This object is used to interpret widgets and prefixes. The widget object will be defined with respect to an ambient notion of identifiers, which we denote Id. These will be part of the hyperdoctrine structure defined below, and for now we simply assume such an object exists. We will also use a notion of timesteps internal to the widget object. Note that this timestep is different from the abstract timestep used for defining Rel_R, but the two are related as described below. We denote the abstract timesteps by Time.

Before we can define the widget object, we need to define an appropriate object of commands. In our minimal Widget API, the only *semantic* commands will be setColor, onClick and onKeypress. The rest of the API is defined as morphisms on the widget object itself. To work with the semantic commands, we additionally need a *compatibility* relation. This relation describes which commands can be applied at the same time. In our setting this relation is minimal, but it can in principle be used to encode whatever restrictions are needed for a given API.

**Definition 5.** *We define the command object as*

Cmd = {(setColor, color), onClick, onKeypress}

*where color is an element of a "color" object. The compatibility relations are:*

(op, arg) ⋈ (op′, arg′) iff (op = op′ ⇒ arg = arg′)

The only non-compatible combination of commands is two applications of the setColor command, the idea being that one cannot set the color twice in the same timestep.

We can now define the widget and prefix objects

**Definition 6.** *The widget object, denoted* Widget*, is indexed by* i ∈ Id *and is defined as*

$$\begin{aligned} \mathsf{Widget}_{\infty}\ i &= \left\{ (w, i) \mid w \in \mathcal{P}(\mathsf{Time} \times \mathsf{Cmd}),\ (t, c) \in w \land (t, c') \in w \Rightarrow c \bowtie c' \right\}, \\ \mathsf{Widget}_{n}\ i &= \left\{ (w, i) \in \mathsf{Widget}_{\infty}\ i \mid \forall (t, c) \in w,\ t \leqslant n \right\} \end{aligned}$$

*The prefix object, denoted* Prefix*, is indexed by* i ∈ Id *and* t ∈ Time *and is:*

$$\begin{aligned} \mathsf{Prefix}_{\infty}\ i\ t &= \left\{ (P, i) \subset \mathsf{Widget}_{\infty}\ i \mid \forall (t', c) \in P,\ t' \leqslant t \right\} \\ \mathsf{Prefix}_{n}\ i\ t &= \begin{cases} \left\{ (P, i) \subset \mathsf{Prefix}_{\infty}\ i\ t \mid \forall (t', c) \in P,\ t' \leqslant n \right\} & n < t \\ 1 & n \geqslant t \end{cases} \end{aligned}$$

The widget object is a collection of times and commands keeping track of what has happened to the widget at various times – imagine a *logbook* with entries for each time step. At the point at infinity, the "global" behavior of the widget is defined, i.e., the full logbook of the widget. For each n, Widget_n is simply what has happened to the widget so far, i.e., a truncated logbook. The prefix object is a widget object that is only defined up to some timestep, and is the unit after that. This yields a semantic difference between the widget whose color is set only once and the widget whose color is set at every timestep. This reflects a real difference in actual widget behavior: if *turnRedOnClick* w is later set to blue, it will remain blue, but *keepTurningRed* w will turn back to red.
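For intuition only, here is a tiny sketch of the logbook reading and of the compatibility constraint; the types and names are ours and only mirror Definitions 5 and 6, they are not part of the model.

```haskell
-- A hypothetical "logbook" view of a widget: timestamped commands, with the
-- single restriction that the colour cannot be set twice in one timestep.
data Cmd = SetColor String | OnClick | OnKeypress deriving (Eq, Show)

compatible :: Cmd -> Cmd -> Bool
compatible (SetColor c) (SetColor c') = c == c'   -- two setColor entries must agree
compatible _            _             = True      -- all other combinations are fine

type Logbook = [(Int, Cmd)]                       -- (timestep, command) entries

-- Well-formedness mirrors the condition in Definition 6: any two entries at
-- the same timestep must be compatible.
wellFormed :: Logbook -> Bool
wellFormed w = and [ compatible c c' | (t, c) <- w, (t', c') <- w, t == t' ]
```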

To manipulate widgets we define two "restriction" maps.

**Definition 7.** *We define the following on widgets and prefixes*

$$\begin{aligned} \mathsf{shift}\ t &: \mathsf{Widget}\ i \to_{\mathsf{Rel}_{\mathcal{R}}} \mathsf{Widget}\ i \\ (\mathsf{shift}\ t\ W)_n &= \left\{ (t' - t, c) \mid (t', c) \in W \land t \leqslant t' \right\} \end{aligned}$$

$$\begin{aligned} \mathsf{prefix}\ t\ i &: \mathsf{Widget}\ i \to_{\mathsf{Rel}_{\mathcal{R}}} \mathsf{Prefix}\ i\ t \\ (\mathsf{prefix}\ t\ i\ W)_n &= \begin{cases} \{(t', c) \in W \mid t' < t\} & n < t \\ 1 & n \geqslant t \end{cases} \end{aligned}$$

The intuition behind these is that prefix t i "cuts off" the widget after t, giving a prefix, whereas shift t shifts forward all entries in the widget by t.

Using the above, we can now define the split and join morphisms. These are again given w.r.t ambient Id and Time objects, which will be part of the full hyperdoctrine structure:

**Definition 8.** *We define the following morphisms on the widget object*

$$\begin{aligned} \mathsf{split}\ i\ t &: \mathsf{Widget}\ i \to_{\mathsf{Rel}_{\mathcal{R}}} \mathsf{Prefix}\ i\ t \otimes \mathsf{Widget}\ i \mathbin{@} t \\ (\mathsf{split}\ i\ t\ w)_n &= (\mathsf{prefix}\ t\ i\ w,\ \mathsf{shift}\ t\ w)_n \\[1ex] \mathsf{join}\ i\ t &: \mathsf{Prefix}\ i\ t \otimes \mathsf{Widget}\ i \mathbin{@} t \to_{\mathsf{Rel}_{\mathcal{R}}} \mathsf{Widget}\ i \\ (\mathsf{join}\ i\ t\ (p, w))_n &= \begin{cases} p_n & n < t \\ w_{n-t} & n \geqslant t \end{cases} \end{aligned}$$
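Continuing the logbook sketch above (again with names of our own choosing and a purely set-theoretic reading, not the relational semantics itself), shift, prefix, split and join act on lists of timestamped entries roughly as follows:

```haskell
-- Hypothetical logbook analogues of Definitions 7 and 8.
type Entry = (Int, String)          -- (timestep, command name)

shiftW :: Int -> [Entry] -> [Entry]
shiftW t w = [ (t' - t, c) | (t', c) <- w, t' >= t ]   -- move later entries back to time 0

prefixW :: Int -> [Entry] -> [Entry]
prefixW t w = [ (t', c) | (t', c) <- w, t' < t ]       -- keep only entries before t

splitW :: Int -> [Entry] -> ([Entry], [Entry])
splitW t w = (prefixW t w, shiftW t w)

joinW :: Int -> ([Entry], [Entry]) -> [Entry]
joinW t (p, w) = p ++ [ (t' + t, c) | (t', c) <- w ]   -- re-offset the suffix by t

-- On well-formed logbooks, joinW t (splitW t w) recovers w up to reordering.
```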

*Linear-non-linear Hyperdoctrine* So far we have not explained in detail how to model the quantifiers in our system. To do this, we use the notion of a *hyperdoctrine* [22]. For first-order logic, this is a functor from a category of contexts and substitutions to the category of Cartesian closed categories, with the idea that we have one CCC for each valuation of the free first-order variables.

As our category of contexts, we use a Cartesian category to interpret our index objects, Time and Id. The former is interpreted as ℕ + 1 and the latter as ℕ. In our case, both Set and Rel_R are themselves hyperdoctrines w.r.t this category of contexts, the former a first-order hyperdoctrine and the latter a multiplicative intuitionistic linear logic (MILL) hyperdoctrine. Together these form a linear-non-linear hyperdoctrine through the adjunction given in Theorem 5.

**Definition 9.** *A linear-non-linear hyperdoctrine is a MILL hyperdoctrine* L *together with a first-order hyperdoctrine* C *and a fiber-wise monoidal adjunction* F ⊣ G *between them.*

**Theorem 6.** *The categories* Set *and* Rel_R *form a linear-non-linear hyperdoctrine w.r.t the interpretation of the index objects, with the adjunction given as in Theorem 5.*

We refer the reader to the accompanying technical appendix for the full details.

*Denotational Semantics* With the above, we have enough structure to give an interpretation of λWidget. Again, most of this interpretation is standard in its use of the hyperdoctrine structure, and we interpret in the obvious way using the linear hyperdoctrine structure on Rel_R. As an example, we sketch the interpretation of the widget object and the setColor command below.

**Definition 10.** *We interpret the* Widget i *and* Prefix i t *types using the widget and prefix objects:*

$$\llbracket \Theta \vdash \mathsf{Widget}\ i \rrbracket = \mathsf{Widget}\ \llbracket \Theta \vdash_s i : \mathsf{Id} \rrbracket \qquad\qquad \llbracket \Theta \vdash \mathsf{Prefix}\ i\ t \rrbracket = \mathsf{Prefix}\ \llbracket \Theta \vdash_s i : \mathsf{Id} \rrbracket\ \llbracket \Theta \vdash_s t : \mathsf{Time} \rrbracket$$

*and we interpret the* setColor *commands as:*

$$\begin{aligned} &\mathsf{setColor} : \forall(i : \mathsf{Id}).\ \mathsf{Widget}\ i \otimes F\,\mathsf{Color} \multimap \mathsf{Widget}\ i \\ &\llbracket \mathsf{setColor} \rrbracket = \left\{ w \cup_W \{(0, (\mathsf{setColor}, col))\} \mid w \in \mathsf{Widget}\ i,\ col \in \mathsf{Color} \right\} \end{aligned}$$

*where* ∪_W *is a "widget union", which is a union of sets such that identifier indices and compatibility of commands are respected.*

This interpretation shows that a widget is indeed a logbook of events. Using the setColor command simply adds an entry to the logbook of the widget. Note that we only set the color in the current timestep. To set the color in the future, we combine the above with appropriate uses of split and join. The interpretations of split and join are given by their semantic counterparts, and the interpretations of onClick and onKeypress are given, using our non-deterministic semantics, by associating a widget with *all possible occurrences* of the corresponding event.

*Soundness of Substitution* Finally, we prove that semantic substitution is sound w.r.t syntactic substitution. As with the proofs of type preservation for syntactic substitution, there are several cases for the different kinds of substitution, but the main result is again concerned with substitution of indices:

**Theorem 7.** *Given* ζ : Θ′ → Θ*,* Θ; Γ ⊢_c e : X *and* Θ; Γ; Δ ⊢_l t : A*, then*

$$\begin{aligned} \llbracket \zeta \rrbracket\;\llbracket \Theta; \Gamma \vdash_c e : X \rrbracket &= \llbracket \Theta'; \zeta(\Gamma) \vdash_c \zeta(e) : \zeta(X) \rrbracket \\ \llbracket \zeta \rrbracket\;\llbracket \Theta; \Gamma; \Delta \vdash_l t : A \rrbracket &= \llbracket \Theta'; \zeta(\Gamma); \zeta(\Delta) \vdash_l \zeta(t) : \zeta(A) \rrbracket \end{aligned}$$

Proofs for all six substitution lemmas can be found in the technical appendix.

#### **Related and Future Work**

Much work has aimed at a logical perspective on FRP via the Curry–Howard correspondence [21,18,17,19,20,10,1]. As mentioned earlier, most of this work has focused on calculi that have a Nakano-style later modality [25], but this has the consequence that it makes it easy to write programs which wake up on every clock tick. In this paper, we remove the explicit next-step modality from the calculus, which opens the door to a more efficient implementation style based on the so-called "push" (or event/notification-based) implementation style. Elliott [12] also looked at implementing a push-based model, but viewed it as an optimization rather than a first-class feature in its own right. In future work, we plan on implementing a language based upon this calculus, with the idea that we can compile to JavaScript, represent widgets with DOM nodes, and represent the ◇A and A @ n temporal connectives using doubly-negated callback types (in Haskell notation, Event A = (A -> IO ()) -> IO ()). This should let us write GUI programs in functional style, while generating imperative, callback-based code in the same style that a handwritten GUI program would use.
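To make the intended compilation target concrete, the following is a minimal sketch of that doubly-negated callback encoding. The code is ours, not the planned implementation: it assumes one-shot events, the names `Event`, `pureE`, `joinE` and `selectE` are hypothetical, and a real implementation would need proper unsubscription and re-subscription of the still-pending event.

```haskell
import Data.IORef

-- The "push"-style encoding of an event as a doubly-negated callback type.
newtype Event a = Event { runEvent :: (a -> IO ()) -> IO () }

instance Functor Event where
  fmap f (Event e) = Event $ \k -> e (k . f)

-- Axiom (T): A ⊸ ◇A, i.e. an event that fires immediately.
pureE :: a -> Event a
pureE x = Event ($ x)

-- Axiom (4): ◇◇A ⊸ ◇A, i.e. flattening of nested events.
joinE :: Event (Event a) -> Event a
joinE (Event ee) = Event $ \k -> ee (\(Event e) -> e k)

-- A rough analogue of the select rule: whichever event fires first chooses the
-- branch, and the still-pending event is handed to that branch's continuation.
selectE :: Event a -> Event b
        -> (a -> Event b -> Event c)
        -> (b -> Event a -> Event c)
        -> Event c
selectE (Event ea) (Event eb) onA onB = Event $ \k -> do
  done <- newIORef False
  let once act = do
        fired <- readIORef done
        if fired then pure () else writeIORef done True >> act
  ea (\a -> once (runEvent (onA a (Event eb)) k))
  eb (\b -> once (runEvent (onB b (Event ea)) k))
```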

Our model, in terms of Set^(ℕ+1), enriches LTL's semantics from time-indexed truth-values to time-indexed sets. The addition of the global view or point at infinity enables our model to distinguish between least and greatest fixed points [15] (i.e., inductive and coinductive types), unlike in models of guarded recursion where guarded types are bilimit-compact [6]. This lets us encode temporal liveness and safety properties using inductive and coinductive types [10,2].

A recent development for comonadic modalities is the introduction of the so-called 'Fitch-style' calculi [7,11] as an alternative to the Pfenning–Davies pattern-style elimination [26]. These calculi have been used successfully for FRP [1], and one interesting question is whether they extend to adjoint calculi as well – i.e., can the F (X) modality support a direct-style eliminator?

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **On the Expressiveness of Büchi Arithmetic**

Christoph Haase<sup>1</sup> (✉) and Jakub Różycki<sup>2</sup>

<sup>1</sup> Department of Computer Science, University of Oxford, Oxford, UK christoph.haase@cs.ox.ac.uk

<sup>2</sup> Institute of Mathematics, University of Warsaw, Warsaw, Poland

**Abstract.** We show that the existential fragment of Büchi arithmetic is strictly less expressive than full Büchi arithmetic of any base, and moreover establish that its Σ2-fragment is already expressively complete. Furthermore, we show that regular languages of polynomial growth are definable in the existential fragment of Büchi arithmetic.

**Keywords:** logical theories · logical definability · quantifier elimination · automatic structures · regular languages

### **1 Introduction**

This paper studies the expressive power of Büchi arithmetic, an extension of Presburger arithmetic, the first-order theory of the structure ⟨ℕ, 0, 1, +⟩. Büchi arithmetic additionally allows for expressing restricted divisibility properties while retaining decidability. Given an integer p ≥ 2, *Büchi arithmetic of base* p is the first-order theory of the structure ⟨ℕ, 0, 1, +, V_p⟩, where V_p is a binary predicate such that V_p(a, b) holds if and only if a is the largest power of p dividing b without remainder, i.e., a = p^k, a | b and p · a ∤ b.
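As a quick executable reading of this definition (a throwaway helper of ours, not part of the paper), V_p can be checked directly:

```haskell
-- vp p a b: a is a power of p, a divides b, and p·a does not divide b,
-- i.e. a = p^k for the largest k such that p^k divides b (so vp p a 0 is False).
vp :: Integer -> Integer -> Integer -> Bool
vp p a b = isPowerOf p a && b `mod` a == 0 && b `mod` (p * a) /= 0
  where
    isPowerOf _ 1 = True
    isPowerOf q n = n > 1 && n `mod` q == 0 && isPowerOf q (n `div` q)
```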

Presburger arithmetic admits quantifier elimination in the extended structure ⟨ℕ, 0, 1, +, {c|·}_{c>1}⟩ additionally consisting of unary divisibility predicates c|· for every c > 1 [10]. It follows that the existential fragment of Presburger arithmetic is expressively complete, since any predicate c|· can be expressed using an additional existentially quantified variable. We study the analogous question for Büchi arithmetic and show, as the main result of this paper, that its existential fragment is, in any base, strictly less expressive than full Büchi arithmetic. Notably, this result implies that there does not exist a quantifier-elimination result *à la* Presburger for Büchi arithmetic, i.e., any extension of Büchi arithmetic with additional predicates definable in existential Büchi arithmetic does not admit quantifier elimination.

A central result about Büchi arithmetic is that it is an automatic structure: a set M ⊆ ℕⁿ is definable in Büchi arithmetic of base p if and only if M is recognizable by a finite-state automaton under a base p encoding of the natural

 Parts of this research were carried out while the first author was affiliated with the Department of Computer Science, University College London, UK.

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 310–323, 2021. https://doi.org/10.1007/978-3-030-71995-1_16

numbers. Equivalently, M is p*-regular*. This result was first stated by Büchi [4], albeit in an incorrect form, and later correctly stated and proved by Bruyère [2], see also [3]. Villemaire showed that the Σ3-fragment of Büchi arithmetic is expressively complete [13, Cor. 2.4]. He established this result by showing how to construct a Σ3-formula defining the language of a given finite-state automaton. We observe that Villemaire's construction can actually be improved to a Σ2-formula and thus obtain a full characterization of the expressive power of Büchi arithmetic in terms of the number of quantifier alternations.

Our approach to separating the expressiveness of existential Büchi arithmetic from full Büchi arithmetic in base p is based on a counting argument. Given a set M ⊆ ℕ, define the counting function d_M(n) := #(M ∩ {p^{n−1}, . . . , p^n − 1}), which counts the numbers of bit-length n in base p in M. If M is definable in existential Büchi arithmetic of base p, we show that d_M is either O(n^c) for some c ≥ 0, or at least c · p^n for some constant c > 0 and infinitely many n ∈ ℕ. For instance, for M_p ⊆ ℕ defined as the set of numbers whose p-ary expansion lies in the regular language {10, 01}^∗, we have d_{M_p}(n) = Θ(2^{n/2}), and hence M_p is not definable in existential Büchi arithmetic of base p. However, M_p being p-regular implies that M_p is definable by a Σ2-formula of Büchi arithmetic of base p.

We also show that existential Büchi arithmetic defines all regular languages of polynomial density, encoded as sets of integers. Given a language L ⊆ Σ^∗, let the counting function d_L : ℕ → ℕ be such that d_L(n) := #(L ∩ Σ^n). Szilard et al. [11] say that L has *polynomial density* whenever d_L(n) is O(n^c) for some non-negative integer c. If moreover L is regular then Szilard et al. show that L is represented as a finite union of regular expressions of the form v_0 w_1^∗ v_1 ··· w_k^∗ v_k such that 0 ≤ k ≤ c + 1, v_0, w_1, v_1, . . . , v_k, w_k ∈ Σ^∗ [11, Thm. 3]. We show that existential Büchi arithmetic defines any language represented by a regular expression v_0 w_1^∗ v_1 ··· w_k^∗ v_k, which implies that existential Büchi arithmetic defines all regular languages of polynomial density.

#### **2 Preliminaries**

Given *v* = (v_1, . . . , v_d) ∈ ℤ^d, we denote by ‖*v*‖_∞ the maximum norm of *v*, i.e., ‖*v*‖_∞ = max{|v_1|, . . . , |v_d|}. For a matrix **A** ∈ ℤ^{m×d} with entries a_{i,j}, 1 ≤ i ≤ m, 1 ≤ j ≤ d, we denote by ‖**A**‖_{1,∞} the one-infinity norm of **A**, i.e., ‖**A**‖_{1,∞} = max{|a_{i,1}| + ··· + |a_{i,d}| : 1 ≤ i ≤ m}.

Let Σ be an alphabet and w ∈ Σ^∗; we denote by |w| the length of w. Given a set U ⊆ ℕ, we denote by w^U := {w^u : u ∈ U}. Thus, for example, w^∗ = w^ℕ.

For an integer p ≥ 2, let Σ_p := {0, . . . , p − 1}. We view words over Σ_p as numbers encoded in p-ary most-significant bit first encoding. Tuples of numbers of dimension n can be encoded as words over the alphabet Σ_p^n. For w = *v*_m ··· *v*_0 ∈ (Σ_p^n)^{m+1}, we denote by [w]_p ∈ ℕ^n the n-tuple

$$\lbrack w \rbrack_p := \sum_{i=0}^{m} v_i \cdot p^i\,.$$

We furthermore define [ε]_p := 0. Note that [·]_p is not injective since, e.g., 01 and 001 both encode the number one. Given L ⊆ (Σ_p^n)^∗, we define

$$\lbrack L \rbrack_p := \{ \lbrack w \rbrack_p : w \in L \} \subseteq \mathbb{N}^n\ .$$
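For concreteness, the MSB-first interpretation [·]_p can be computed as follows; this is an illustrative helper of ours written for single-track words, and digit vectors work the same way componentwise.

```haskell
-- Interpret a most-significant-digit-first word over {0,...,p-1} as a number.
-- Non-injectivity is visible directly: decode 2 [0,1] == decode 2 [0,0,1] == 1.
decode :: Integer -> [Integer] -> Integer
decode p = foldl (\acc d -> acc * p + d) 0
```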

*Automata.* A *deterministic automaton* is a tuple A = (Q, Σ, δ, q0, F), where


**–** Q is a (possibly infinite) set of states,
**–** Σ is an alphabet,
**–** δ : Q × Σ → Q ∪ {⊥} is a transition function,
**–** q_0 ∈ Q is an initial state, and
**–** F ⊆ Q is a set of accepting states.

For states q, r ∈ Q and u ∈ Σ, we write $q \xrightarrow{u} r$ if δ(q, u) = r, and extend this notation inductively to words by stipulating, for w ∈ Σ^∗ and u ∈ Σ, that $q \xrightarrow{w \cdot u} r$ if there is s ∈ Q such that $q \xrightarrow{w} s \xrightarrow{u} r$. The *language of* A is defined as L(A) = {w ∈ Σ^∗ : $q_0 \xrightarrow{w} q_f$, q_f ∈ F}.

Note that *a priori* we allow automata to have infinitely many states and to have partially defined transition functions (due to the presence of ⊥ in the co-domain of δ). If Q is finite then we call A a *deterministic finite automaton (DFA)*, and if in addition Σ = Σ_p^n for some p ≥ 2 and n ≥ 1 then A is called a p*-automaton*. Throughout this paper, we assume, without loss of generality, that all states of a DFA are live, i.e., every state is reachable from the initial state and can reach an accepting state.

*Arithmetic theories.* As stated in the introduction, Presburger arithmetic is the first-order theory of the structure ⟨ℕ, 0, 1, +⟩, and Büchi arithmetic of base p the first-order theory of the extended structure ⟨ℕ, 0, 1, +, V_p⟩. We write atomic formulas of Presburger arithmetic as *a* · *x* = c, where *a* = (a_1, . . . , a_d) with a_i ∈ ℤ, c ∈ ℤ, and *x* = (x_1, . . . , x_d) is a vector of unknowns. In Büchi arithmetic we additionally have atomic formulas V_p(x, y) for unknowns x and y. For technical convenience, we assert that V_p(x, 0) never holds.<sup>3</sup> We write Φ(x) or Φ(*x*) to indicate that x or a vector of unknowns *x* occurs free in Φ. If there are further free variables in Φ, we assume them to be implicitly existentially quantified.

We may without loss of generality assume that no negation symbol occurs in a formula of Büchi arithmetic. First, we have ¬(*a* · *x* = c) ≡ *a* · *x* ≤ c − 1 ∨ *a* · *x* ≥ c + 1, and the order relation ≤ can easily be expressed by introducing an additional existentially quantified variable. Moreover, we have

$$\neg V_p(x, y) \equiv y = 0 \lor \exists z \colon V_p(z, y) \land \neg (x = z)\,.$$

Finally, P_p(x) := V_p(x, x) denotes the macro asserting that x is a power of p. Given a formula Φ(*x*) of Büchi arithmetic of base p, we define

$$\left\lbrack \Phi(x) \right\rbrack_p := \left\{ m \in \mathbb{N}^d : \Phi[m/x] \text{ is valid} \right\},$$

<sup>3</sup> Other conventions are possible, e.g., asserting that V_p(x, 0) holds if and only if x = 1 as in [3], but this does not change the sets of numbers definable in Büchi arithmetic.

where, for *m* = (m1,...,md) and *x* = (x1,...,xd), Φ[*m*/*x*] is the formula obtained from replacing every x<sup>i</sup> by m<sup>i</sup> in Φ. The set of sets of numbers definable in Presburger arithmetic is denoted by

> **PA** := {[Φ(x)] : Φ(x) is a formula of Presburger arithmetic} .

Analogously, we define the sets of numbers definable in fragments of Büchi arithmetic of base p with a fixed number of quantifier alternations as

Σ_i-**BA**_p := {[Φ(x)]_p : Φ(x) is a Σ_i-formula of Büchi arithmetic of base p} .

Finally, **BA**_p := ⋃_{i≥1} Σ_i-**BA**_p denotes the sets of numbers definable in Büchi arithmetic of base p.

For separating existential B¨uchi arithmetic from full B¨uchi arithmetic, we employ some tools from enumerative combinatorics. As defined in [15], a formula of *parametric Presburger arithmetic* with parameter t is a formula of Presburger arithmetic Φ<sup>t</sup> in which atomic formulas are of the form *a* · *x* = c(t), where c(t) is a univariate polynomial with indeterminate <sup>t</sup> and coefficients in <sup>Z</sup>. For <sup>n</sup> <sup>∈</sup> <sup>N</sup>, we denote by Φ<sup>n</sup> the formula of Presburger arithmetic obtained from replacing c(t) in every atomic formula of Φ<sup>t</sup> by the value of c(n). We associate to a formula <sup>Φ</sup>t(*x*) the counting function #Φt(*x*): <sup>N</sup> <sup>→</sup> <sup>N</sup> ∪ {∞} such that

$$\#\Phi_t(x)(n) := \#\lbrack \Phi_n(x) \rbrack\,.$$

Throughout this paper, we restrict ourselves to formulas Φ_t(*x*) of parametric Presburger arithmetic in which c(t) is the identity function and #Φ_t(*x*)(n) is finite for all n ∈ ℕ.

**Definition 1.** *A function* <sup>f</sup> : <sup>N</sup> <sup>→</sup> <sup>Q</sup> *is an* eventual quasi-polynomial *if there exist a threshold* <sup>t</sup> <sup>∈</sup> <sup>N</sup> *and polynomials* <sup>p</sup>0,...,p<sup>m</sup>−<sup>1</sup> <sup>∈</sup> <sup>Q</sup>[x] *such that for all* n>t*,* f(n) = pi(n) *whenever* n ≡ i mod m*.*

Given an eventual quasi-polynomial f with threshold t and n>t, we denote by f<sup>n</sup> the polynomial p<sup>i</sup> such that n ≡ i mod m. We say that the polynomials p0,...,p<sup>m</sup>−<sup>1</sup> *constitute* the eventual quasi-polynomial f. A result by Woods [15, Thm. 3.5(b)] shows that the counting functions associated to parametric Presburger formulas as defined above are eventual quasi-polynomial.

**Proposition 1 (Woods).** *Let* Φt(*x*) *be a formula of parametric Presburger arithmetic. Then* #Φt(*x*) *is an eventual quasi-polynomial.*

*Semi-linear sets.* A result by Ginsburg and Spanier establishes that the sets of numbers definable in Presburger arithmetic are semi-linear sets [7]. A *linear set* in dimension <sup>d</sup> is given by a base vector *<sup>b</sup>* <sup>∈</sup> <sup>N</sup><sup>d</sup> and a finite set of period vectors <sup>P</sup> <sup>=</sup> {*p*1,..., *<sup>p</sup>*<sup>n</sup>} ⊆ <sup>N</sup><sup>d</sup> and defines the set

$$L(\mathbf{b}, P) := \{ \mathbf{b} + \lambda_1 \cdot \mathbf{p}_1 + \dots + \lambda_n \cdot \mathbf{p}_n : \lambda_i \in \mathbb{N},\ 1 \le i \le n \}.$$

A *semi-linear set* is a finite union of linear sets. For a finite B ⊆ ℕ^d, we write L(B, P) for ⋃_{*b*∈B} L(*b*, P). Semi-linear sets of the form L(B, P) are called hybrid linear sets in [5], and it is known that the set of non-negative integer solutions of a system of linear Diophantine inequalities S : **A** · *x* ≥ *c* is a hybrid linear set [5].

Semi-linear sets in dimension one are also known as *ultimately periodic sets*. In this paper, we represent an ultimately periodic set as a four-tuple U = (t, ℓ, B, R), where t ≥ 0 is a *threshold*, ℓ > 0 is a *period*, B ⊆ {0, . . . , t − 1} and R ⊆ {0, . . . , ℓ − 1}, and U defines the set

$$\lbrack U \rbrack := B \cup \{t + r + \ell \cdot i : r \in R,\ i \ge 0\}\,.$$
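A direct executable reading of this representation (our helper, purely for illustration):

```haskell
-- Membership in the set denoted by an ultimately periodic set U = (t, l, bs, rs),
-- i.e. bs ∪ { t + r + l*i : r ∈ rs, i ≥ 0 } with bs ⊆ {0..t-1} and rs ⊆ {0..l-1}.
memberUP :: (Int, Int, [Int], [Int]) -> Int -> Bool
memberUP (t, l, bs, rs) n =
  n `elem` bs || (n >= t && ((n - t) `mod` l) `elem` rs)
```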

### **3 The inexpressiveness of existential Büchi arithmetic**

We now establish the main result of this paper and show that the existential fragment of Büchi arithmetic is strictly less expressive than general Büchi arithmetic.

**Theorem 1.** *For any base* p ≥ 2*,* Σ1*-***BA**_p ≠ **BA**_p*. In particular, there exists a fixed regular language* L ⊆ {0, 1}^∗ *such that* [L]_p ∈ **BA**_p \ Σ1*-***BA**_p *for every base* p ≥ 2*.*

Given a set M ⊆ ℕ, recall that for a fixed base p ≥ 2, d_M(n) counts the numbers of bit-length n in base p in M. As already discussed in the introduction, we prove Theorem 1 by characterizing the growth of d_M for sets M definable in Büchi arithmetic.

For any formula Φ(x) of existential Büchi arithmetic in prenex normal form, we can with no loss of generality assume that its matrix is in disjunctive normal form, i.e., a disjunction of *systems of linear Diophantine equations with valuation constraints*, each of the form

$$\mathbf{A} \cdot x = \mathbf{c} \wedge \bigwedge_{i \in I} V_p(x_i, y_i),$$

where the x_i and y_i are unknowns from the vector of unknowns *x*. For M = [Φ(x)]_p, in order to determine the growth of d_M, it suffices to determine the maximum growth occurring in any of the systems of linear Diophantine equations with valuation constraints in the matrix of Φ(x), which in turn can be obtained by analyzing the growth of the number of words accepted by a p-automaton defining the set of solutions of such a system.

Let S : **A** · *x* = *c* be a system of linear Diophantine equations such that, throughout this section, **A** is an m × d integer matrix, and fix a base p ≥ 2. Following Wolper and Boigelot [14], we define an automaton A := (Q, Σ_p^d, δ, *q*_0, F) whose language encodes all solutions of S over the alphabet Σ_p^d:

**–** Q := ℤ^m,
**–** δ(*q*, *u*) := p · *q* + **A** · *u* for all *q* ∈ Q and *u* ∈ Σ_p^d,
**–** *q*_0 := **0**, and
**–** F := {*c*}.

As discussed in [14], see also [8], only states *q* such that ‖*q*‖_∞ ≤ ‖**A**‖_{1,∞} and ‖*q*‖_∞ ≤ ‖*c*‖_∞ can reach the accepting state. Hence, all words w ∈ (Σ_p^d)^∗ such that **A** · [w]_p = *c* only visit a finite number of states of A, and to obtain the p-automaton A(S) defining the set of solutions of S we subsequently restrict Q to such states. The following lemma recalls an algebraic characterization of the reachability relation of A(S) established in the proof of Proposition 14 in [8].
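The construction is easy to run directly. The following sketch (our code and names, not from the paper; it only checks membership and does not build the restricted finite state set) reads an MSB-first word of digit vectors and tests whether it encodes a solution:

```haskell
-- Sketch of the Wolper–Boigelot automaton for A·x = c in base p: states are
-- integer vectors, reading a digit vector u maps q to p·q + A·u, the initial
-- state is the zero vector and the single accepting state is c.
type Vec = [Integer]
type Mat = [[Integer]]        -- the rows of A

step :: Integer -> Mat -> Vec -> Vec -> Vec
step p a q u = zipWith (\qi row -> p * qi + sum (zipWith (*) row u)) q a

-- Run the automaton on an MSB-first word of digit vectors and test acceptance;
-- after reading w the state equals A·[w]_p, so acceptance means A·[w]_p = c.
accepts :: Integer -> Mat -> Vec -> [Vec] -> Bool
accepts p a c w = foldl (step p a) (map (const 0) a) w == c
```

For example, `accepts 2 [[1,2]] [5] [[0,0],[0,1],[1,0]]` is `True`: the word encodes x = 1, y = 2, and 1 + 2·2 = 5.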

**Lemma 1.** *Let q*, *r* ∈ ℤ^m *be states of* A(S)*,* w ∈ (Σ_p^d)^n *and x* = [w]_p*. Then* $q \xrightarrow{w} r$ *if and only if there is* y ∈ ℕ *such that*

$$r = q \cdot y + \mathbf{A} \cdot x,\ \|x\|_{\infty} < y, \ y = p^n.$$

Let x be a distinguished variable of *x*. For a word w ∈ (Σ_p^d)^∗ encoding solutions of S, denote by π_x(w) the word v ∈ Σ_p^∗ obtained from projecting w onto the component of w corresponding to x. For a state q of a p-automaton A, we define the counting function C_{q,x} : ℕ → ℕ as

$$C_{q,x}(n) := \# \left\{ \pi_x(w) : q \xrightarrow{w} q,\ w \in (\Sigma_p^d)^n \right\}.$$

We now show that for p-automata arising from systems of linear Diophantine equations, Cq,x can be obtained from an eventual quasi-polynomial.

**Lemma 2.** *For the* p*-automaton* A(S) *associated to* S : **A** · *x* = *c with states* Q *and all* q ∈ Q*, there is an eventual quasi-polynomial* f *such that* C_{q,x}(n) = f(p^n) *for all* n ∈ ℕ*. Moreover, for all sufficiently large* n ∈ ℕ*,* f_{p^n} *is a linear polynomial.*

*Proof.* Let q = *q* ∈ ℤ^m. By Lemma 1, $\boldsymbol{q} \xrightarrow{w} \boldsymbol{q}$ for w ∈ (Σ_p^d)^n if and only if there is a y ∈ ℕ such that

$$q = q \cdot y + \mathbf{A} \cdot x,\ \|x\|_{\infty} < y, \ y = p^n,$$

where *x* = [w]_p. The set of solutions of **A** · *x* + *q* · y = *q*, ‖*x*‖_∞ < y is a hybrid linear set L(D, R) ⊆ ℕ^{d+1}. Let L(B, P) ⊆ ℕ^2 be obtained from L(D, R) by projecting onto the components corresponding to x and y, and assume that x corresponds to the first and y to the second component of L(B, P). Let M_t := ℕ × {t} and

$$f(t) := \# (L(B, P) \cap M_t) \;.$$

Observe that C_{q,x}(n) = f(p^n) and that f(n) is finite for all n ∈ ℕ due to the constraint x < y. Let P = {*p*_1, . . . , *p*_k}; the following formula of parametric Presburger arithmetic defines L(B, P) ∩ M_t:

$$\Phi_t(x, y) := \exists z_1 \cdots \exists z_k \colon \bigvee_{\mathbf{b} \in B} \begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{b} + \sum_{i=1}^k \mathbf{p}_i \cdot z_i \wedge y = t\,.$$

Thus, f = #Φt(x, y) and, by application of Proposition 1, f is an eventual quasi-polynomial.

Since <sup>C</sup>q,x(n) <sup>≤</sup> <sup>p</sup><sup>n</sup>−1 for all <sup>n</sup> <sup>∈</sup> <sup>N</sup>, we in particular have that all polynomials fp<sup>n</sup> constituting f are linear as they would otherwise outgrow Cq,x.

The next step is to lift Lemma 2 to systems of linear Diophantine equations with valuation constraints. To this end, we define a DFA whose language encodes the set of all solutions of predicates of the form V_p(x, y). Formally, for S : V_p(x, y) we define A(S) := (Q, Σ_p^d, δ, q_0, F) such that

**–** Q := {0, 1},
**–** δ(0, *u*) := 0 for all *u* ∈ Σ_p^d such that π_x(*u*) = 0,
**–** δ(0, *u*) := 1 for all *u* ∈ Σ_p^d such that π_x(*u*) = 1 and π_y(*u*) > 0,
**–** δ(1, *u*) := 1 for all *u* ∈ Σ_p^d such that π_x(*u*) = π_y(*u*) = 0,
**–** q_0 := 0, and
**–** F := {1}.

For S : **A** · *x* = *c* ∧ ⋀_{1≤i≤ℓ} V_p(x_i, y_i), we denote by A(S) the DFA obtained from the standard product construction on all DFA for the atomic formulas of S. Hence, the set of states of A(S) is a finite subset of ℤ^m × {0, 1}^ℓ. We now show that the number of words along a cycle of A(S) can also be obtained from an eventual quasi-polynomial.

**Lemma 3.** *Let* S *be a system of linear Diophantine equations with valuation constraints with the associated DFA* A(S) *with states* Q*, and let* q ∈ Q*. There is an eventual quasi-polynomial* f *such that* C_{q,x}(n) = f(p^n)*. Moreover,* f_{p^n} *is a linear polynomial for all* n ∈ ℕ*.*

*Proof.* Let S : **A** · *x* = *c* ∧ ⋀_{1≤i≤ℓ} V_p(x_i, y_i); we have Q ⊆ ℤ^m × {0, 1}^ℓ and thus q = (*q*, b_1, . . . , b_ℓ) ∈ Q. Any self-loop $q \xrightarrow{w}_S q$ with q = (*q*, b_1, . . . , b_ℓ) is a self-loop for the DFA induced by the system of linear Diophantine equations **A** · *x* = *c* with the additional requirement that π_{x_i}([w]_p) = 0 for all 1 ≤ i ≤ ℓ and furthermore π_{y_i}([w]_p) = 0 whenever b_i = 1. Thus $(\boldsymbol{q}, \mathbf{0}) \xrightarrow{w}_{S'} (\boldsymbol{q}, \mathbf{0})$ where

$$S' \colon \mathbf{A} \cdot \boldsymbol{x} = \mathbf{c} \land \bigwedge_{1 \le i \le \ell} x_i = 0 \land \bigwedge_{1 \le i \le \ell,\, b_i = 1} y_i = 0\,.$$

Conversely, $(\boldsymbol{q}, \mathbf{0}) \xrightarrow{w}_{S'} (\boldsymbol{q}, \mathbf{0})$ immediately gives $q \xrightarrow{w}_S q$. The statement is now an immediate consequence of the application of Lemma 2 to S′.

We will from now on implicitly apply Lemma 3. As a first application, we show that Lemma 3 allows us to classify the DFA associated to a system of linear Diophantine equations with valuation constraints.

**Lemma 4.** *The DFA* A(S) *associated to a system of linear Diophantine equations with valuation constraints* S *with states* Q *has either of the following properties:*


*(i)* there is a state q ∈ Q such that the eventual quasi-polynomial f with C_{q,x}(n) = f(p^n) has a non-constant polynomial f_{p^n} for infinitely many n ∈ ℕ; or
*(ii)* there is a constant d ≥ 0 such that C_{q,x}(n) ≤ d for all q ∈ Q and all n ∈ ℕ.

*Proof.* Suppose A(S) has Property (i). For a contradiction, suppose a bound d ≥ 0 as in Property (ii) exists. Let f be the eventual quasi-polynomial from Property (i). Every non-constant polynomial f_{p^n} constituting f is of the form a · x + b with a > 0. As there are infinitely many such n, there is some linear polynomial g(x) = a · x + b such that g = f_{p^n} for infinitely many n ∈ ℕ. Hence g(p^n) > d for some sufficiently large n ∈ ℕ.

For the converse, suppose that A(S) does not have Property (i). Then there are ℓ, m > 0 such that, for every q ∈ Q and the eventual quasi-polynomial f with C_{q,x}(n) = f(p^n), all f_{p^n} with n ≥ ℓ are constant polynomials bounded by m. Hence we can choose d = max({C_{q,x}(n) : q ∈ Q, 0 < n ≤ ℓ} ∪ {m}).

We are now in a position to prove a dichotomy of the growth of the number of words accepted by a DFA corresponding to a system of linear Diophantine equations with valuation constraints.

**Lemma 5.** *Let* S *be a fixed system of linear Diophantine equations with valuation constraints with the associated DFA* A(S)*. Let* L = πx(L(A(S)))*, then either*

*(i)* d_L(n) ≥ c · p^n *for some fixed constant* c > 0 *and infinitely many* n ∈ ℕ*; or*
*(ii)* d_L(n) = O(n^c) *for some fixed constant* c ≥ 0*.*

*Proof.* Let A(S) have the set of states Q, initial state q<sup>0</sup> and final state q<sup>f</sup> . The DFA A(S) has one of the two properties stated in Lemma 4.

If A(S) has Property (i) of Lemma 4 then consider q ∈ Q such that C_{q,x}(n) = f(p^n) for an eventual quasi-polynomial f for which f_{p^n} is non-constant for infinitely many n ∈ ℕ, and let i_1 < i_2 < ··· ∈ ℕ be such that all f_{p^{i_j}} are the same non-constant polynomial a · x + b. Consider v and w such that $q_0 \xrightarrow{v} q \xrightarrow{w} q_f$. Then for all sufficiently large j we have

$$d_L(i_j + |v| + |w|) \ge a \cdot p^{i_j} + b \ge c \cdot p^{(i_j + |v| + |w|)}$$

for some fixed constant c > 0.

Otherwise, A(S) has the Property (ii) of Lemma 4, and there is some fixed <sup>d</sup> <sup>≥</sup> 0 such that <sup>C</sup>q,x(n) <sup>≤</sup> <sup>d</sup> for all <sup>n</sup> <sup>∈</sup> <sup>N</sup> and <sup>q</sup> <sup>∈</sup> <sup>Q</sup>. Every <sup>w</sup> <sup>∈</sup> <sup>L</sup> such that |w| = n can uniquely be decomposed as w = v0w1v1w<sup>2</sup> ··· wkv<sup>k</sup> for some k ≤ |Q| such that

$$q_0 \xrightarrow{v_0} q_{a_1} \xrightarrow{w_1} q_{a_1} \xrightarrow{v_1} q_{a_2} \xrightarrow{w_2} q_{a_2} \xrightarrow{v_2} q_{a_3} \cdots \xrightarrow{w_k} q_{a_k} \xrightarrow{v_k} q_{a_{k+1}},\tag{1}$$

where q_{a_{k+1}} = q_f, q_{a_i} ≠ q_{a_j} for all i ≠ j and each $q_{a_i} \xrightarrow{v_i} q_{a_{i+1}}$ corresponds to a loop-free path in A(S). Since C_{q,x} ≤ d, there are at most d^k ≤ d^{#Q} words u ∈ L of length n that have the same sequence of states in the decomposition of Eq. (1) at the same positions where they occur in w. Moreover, there are at most n^{2k} ≤ n^{2·#Q} possibilities for the positions at which the states q_{a_i} can appear in any u ∈ L of length n for any particular sequence of states in the decomposition of Eq. (1). Finally, there are at most (#Q)^{#Q} such sequences. We thus derive

$$d_L(n) \le (\#Q)^{\#Q} \cdot n^{(2\cdot \#Q)} \cdot d^{(\#Q)} = O(n^c),$$

for some constant c ≥ 0.

**Corollary 1.** *Let* Φ(x) *be a fixed formula of existential Büchi arithmetic of base* p ≥ 2*. Let* M = [Φ(x)]_p*, then either:*

*(i)* d_M(n) ≥ c · p^n *for some fixed constant* c > 0 *and infinitely many* n ∈ ℕ*; or*
*(ii)* d_M(n) = O(n^c) *for some fixed constant* c ≥ 0*.*

*Proof.* Without loss of generality we may assume that Φ(x) is in disjunctive normal form such that Φ(x) = ⋁_{i∈I} Φ_i(x) and each Φ_i(x) is a system of linear Diophantine equations with valuation constraints S_i. For M_i = [Φ_i(x)]_p, we obtain d_{M_i} by application of Lemma 5. If there is a constant c ≥ 0 such that d_{M_i} = O(n^c) for all i ∈ I then d_M = O(n^c). Otherwise, if there is some i ∈ I such that d_{M_i}(n) ≥ c · p^n for some constant c > 0 and infinitely many n ∈ ℕ then d_M(n) ≥ c · p^n for infinitely many n ∈ ℕ.

As an immediate consequence of Corollary 1, we obtain:

**Corollary 2.** *Let* p ≥ 2 *and* M ⊆ ℕ *such that* f = o(d_M) *for any* f = O(n^c)*,* c ≥ 0*, and* d_M = o(p^n)*. Then* M ∉ Σ1*-***BA**_p*.*

For any p ≥ 2, consider L = {01, 10}^∗ ⊆ Σ_p^∗ and M = [L]_p. We have d_M(n) = Θ(2^{n/2}), and thus Corollary 2 yields M ∉ Σ1-**BA**_p. However, since M is p-regular, we have M ∈ **BA**_p. This concludes the proof of Theorem 1.

### **4 Expressive completeness of the Σ2-fragment of Büchi arithmetic**

For a regular language L ⊆ (Σ_p^d)^∗ given by a DFA, Villemaire shows in the proof of Theorem 2.2 in [13] how to construct a Σ3-formula of Büchi arithmetic Φ_L(*x*) such that [Φ_L(*x*)]_p = [L]_p. This construction is modularized and relies on an existential formula Φ_{p,j}(x, y) expressing that "x *is a power of* p *and the coefficient of this power of* p *in the representation of* y *in base* p *is* j":

$$\Phi_{p,j}(x,y) \equiv P_p(x) \land \exists t \, \exists u \, \exists z \colon (y = z + j \cdot x + t) \land (z < x) \land \big((V_p(u,t) \land u > x) \lor t = 0\big)\,.$$

The only reason why ΦL(*x*) in [13] is a Σ3-formula is that Φp,j (x, y) appears in an implication both as antecedent and as consequent inside an existential formula. Thus, if one could additionally define Φp,j (x, y) by a Π1-formula then ΦL(*x*) immediately becomes a Σ2-formula. That is, however, not difficult to achieve by defining:

$$\Phi_{p,j}(x,y) := P_p(x) \land \forall s \; \forall t \; \forall u \; \forall z \colon \big(\neg(s = z + j \cdot x + t) \lor (z \ge x) \lor ((\neg V_p(u,t) \lor x \ge u) \land \neg(t = 0))\big) \to \neg(s = y)\,.$$

Note that the order relation can also be expressed by a universal formula: x ≤ y if and only if <sup>∀</sup><sup>z</sup> : (<sup>y</sup> <sup>+</sup><sup>z</sup> <sup>=</sup> <sup>x</sup>) <sup>→</sup> (<sup>z</sup> = 0). Thus, <sup>Φ</sup>p,j (x, y) is indeed a <sup>Π</sup><sup>1</sup> formula.

Combining Φ_{p,j}(x, y) with the results in [13], we obtain that the Σ2-fragment of Büchi arithmetic is expressively complete.

**Theorem 2.** *For any base* p ≥ 2*,* Σ2*-***BA**_p = **BA**_p*.*

### **5 Existential Büchi arithmetic defines regular languages of polynomial growth**

For a language L ⊆ Σ∗, Szilard et al. [11] say that L has *polynomial growth* if <sup>d</sup>L(n) = <sup>O</sup>(n<sup>c</sup>) for some constant <sup>c</sup> <sup>≥</sup> 0 and all <sup>n</sup> <sup>∈</sup> <sup>N</sup>. One of the main results of [11] is that a regular language L has polynomial growth if and only if L can be represented as a finite union of regular expressions of the form

$$v_0 w_1^* v_1 \cdots v_{k-1} w_k^* v_k \,. \tag{2}$$

Denote by

**PREG**_p := { [L]_p : L ⊆ Σ_p^∗, L is a regular language of polynomial growth }

the numerical encoding of all regular languages of polynomial growth in base p. We show in this section that existential Büchi arithmetic defines any language represented by a regular expression of the form in Eq. (2). This immediately gives the following theorem.

**Theorem 3.** *For any base* p ≥ 2*,* **PREG**<sup>p</sup> ⊆ Σ1*-***BA**p*.*

We first require a couple of abbreviations. Define

$$W_p(x,y) := P_p(y) \land x < y \le p \cdot x,$$

which expresses that y is the smallest power of p strictly greater than x.

Let ℓ > 0. Lohrey and Zetzsche introduce in [9] the predicate S_ℓ(x, y), which holds whenever

$$x = p^r \text{ and } y = p^{r + \ell \cdot i} \text{ for some } i, r \ge 0\,.$$

They show that S (x, y) is definable in existential B¨uchi arithmetic. Since y = <sup>p</sup> ·<sup>i</sup> · <sup>x</sup> if and only if <sup>y</sup> <sup>≡</sup> <sup>x</sup> mod (<sup>p</sup> <sup>−</sup> 1), one can obtain <sup>S</sup> as

$$S_\ell(x, y) := P_p(x) \land P_p(y) \land \exists z \colon (y - x = (p^\ell - 1) \cdot z) \land y \ge x\,.$$
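As a quick sanity check, take p = 2 and ℓ = 3: then S_3(2, 128) holds, since 2 = 2^1, 128 = 2^{1+3·2} and

$$128 - 2 = 126 = (2^3 - 1) \cdot 18,$$

whereas S_3(2, 64) fails, because 64 − 2 = 62 is not divisible by 2^3 − 1 = 7.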

We slightly generalize S_ℓ. For U ⊆ N, define the predicate S_U(x, y) to hold whenever

x = p^r and y = p^{r+u} for some r ≥ 0 and u ∈ U.

**Lemma 6.** *For any ultimately periodic set* U ⊆ N, *the predicate* S_U(x, y) *is definable in existential Büchi arithmetic.*

*Proof.* Suppose that U is given as (t, ℓ, B, R), where B is a finite set, ℓ is a period, and R is a set of offsets such that U = B ∪ {t + r + ℓ · i : r ∈ R, i ≥ 0}. We define

$$S\_U(x, y) := P\_p(x) \land P\_p(y) \land \bigvee\_{b \in B} y = p^b \cdot x \lor \bigvee\_{r \in R} S\_\ell(p^{t+r} \cdot x, y) \,.$$
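For instance, reading the representation as above, the set U = {0} ∪ {5 + 2i : i ≥ 0} is described by (t, ℓ, B, R) = (5, 2, {0}, {0}), and the definition unfolds to

$$S_U(x, y) := P_p(x) \land P_p(y) \land \big(y = p^0 \cdot x \;\lor\; S_2(p^{5} \cdot x, y)\big),$$

which holds exactly when y = p^u · x for some u ∈ U.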

Towards proving Theorem 3, we now show that we can define ⟦w^*⟧_p for any w ∈ Σ_p^*.

**Lemma 7.** *For any* w ∈ Σ_p^*, ⟦w^*⟧_p *is definable by a formula of existential Büchi arithmetic* Φ_{w^*}(x).

*Proof.* Let ℓ = |w| and m = p^ℓ; in particular, m is a power of p with ⟦w⟧_p < m. Then for any k > 0,

$$\llbracket w^k \rrbracket_p = \llbracket w \rrbracket_p \cdot \sum_{i=0}^{k-1} m^i = \llbracket w \rrbracket_p \cdot \frac{m^k - 1}{m - 1}\,.$$

It follows that ⟦w^*⟧_p is defined by

$$\Phi_{w^*}(x) := x = 0 \lor \exists y \colon S_\ell(m, y) \land (m - 1) \cdot x = \llbracket w \rrbracket_p \cdot (y - 1)\,.$$
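For instance, with p = 10 and w = 12 we have ℓ = 2 and m = 100, and

$$\llbracket w^3 \rrbracket_{10} = 121212 = 12 \cdot \frac{100^3 - 1}{100 - 1} = 12 \cdot 10101,$$

so x = 121212 satisfies Φ_{w^*}(x) with witness y = 100^3: indeed S_2(100, 10^6) holds and 99 · 121212 = 12 · (10^6 − 1).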

Building upon Lemma 7, we now show that, for any w ∈ Σ_p^*, we can define ⟦w^+⟧_p shifted to the left by a number of zeros specified by an ultimately periodic set.

**Lemma 8.** *Let* w ∈ Σ_p^* *and* U *be an ultimately periodic set. Then* ⟦w^+ 0^U⟧_p *is definable by a formula of existential Büchi arithmetic* Φ_{U,w^+}(x).

*Proof.* The case w ∈ 0^* is trivial. Thus, let w = w′ · w_0 such that w′ ∈ Σ_p^* · (Σ_p \ {0}) and w_0 ∈ 0^*. Observe that for i < j, ⟦w^j⟧_p − ⟦w^i⟧_p = ⟦w^{j−i} 0^{i·|w|}⟧_p. We define

$$\begin{aligned} \Phi_{U, w^+}(x) := \exists y \, \exists z \colon\; & y < z \land \Phi_{w^*}(y) \land \Phi_{w^*}(z) \land \bigvee_{0 \le i < |w|} x = p^i \cdot (z - y) \;\land \\ & \exists s \, \exists t \colon S_U(1, s) \land V_p(t, x) \land t = p^{|w_0| + 1} \cdot s \,. \end{aligned}$$

The first line defines the set ⟦w^+ 0^*⟧_p, whereas the second line ensures that the number of trailing zeros is in the set U + |w_0|.

We now have all the ingredients to prove the following key proposition.

**Proposition 2.** *Let* L = v_0 w_1^* v_1 ··· v_{k−1} w_k^* v_k. *Then* ⟦L⟧_p *is definable in existential Büchi arithmetic.*

*Proof.* The proposition follows from showing the statement for languages of the form given below: L is a finite union of such languages, one for each choice of which of the w_i^* contribute at least one iteration (the unused w_i^* are dropped and the neighbouring v's are merged).

$$L' = v\_0 w\_1^+ v\_1 \cdots v\_{k-1} w\_k^+ v\_k \ .$$

We show the statement by induction on k. The induction base case k = 0 is trivial. For the induction step, assume that for M = v_1 w_2^+ v_2 ··· v_{k−1} w_k^+ v_k, ⟦M⟧_p is defined by a formula Φ_k(x) of existential Büchi arithmetic, and let v_0, w_1 ∈ Σ_p^*.

We first show how to define N = w_1^+ v_1 w_2^+ v_2 ··· v_{k−1} w_k^+ v_k. To this end, factor M = M_0 · M′, where M_0 ⊆ 0^* and M′ ⊆ (Σ_p \ {0}) · Σ_p^*. Observe that ⟦M′⟧_p = ⟦Φ_k(x)⟧_p, and that both U = {|w| : w ∈ M} and V = {|w| : w ∈ M_0} are ultimately periodic sets, cf. [6,12]. We moreover assume that w_1 ∉ 0^*, otherwise we are done. Factor w_1 = w · w_0 such that w ∈ Σ_p^* · (Σ_p \ {0}) and w_0 ∈ 0^*. Recall that W_p(x, y) holds if and only if y is the smallest power of p strictly greater than x, and define

$$\begin{aligned} \Psi_{k+1}(x) := \exists y \, \exists z \colon\; & \Phi_k(y) \land \Phi_{U,w^+}(z) \land x = y + z \;\land \\ & \exists s \, \exists t \colon W_p(y,s) \land S_V(s,t) \land V_p(p^{|w_0|+1} \cdot t, z) \,. \end{aligned}$$

The first line composes x as the sum of some y ∈ ⟦M⟧_p and z ∈ ⟦w^+ 0^U⟧_p. The second line ensures that the number of zeros between the leading digit of y and the last non-zero digit of z in their p-ary expansions is in V + |w_0|. Thus, ⟦N⟧_p = ⟦Ψ_{k+1}(x)⟧_p.


Finally, we account for the prefix v_0 and define

$$\Phi_{k+1}(x) := \exists y \, \exists z \colon x = y + p \cdot z \cdot \llbracket v_0 \rrbracket_p \land \Psi_{k+1}(y) \land \exists s \colon W_p(y,s) \land S_T(s,z) \;. \tag{7}$$

Since we can define the encoding of any regular language of the form (2) in existential Büchi arithmetic via Proposition 2, we can also define finite unions of such encodings, and thus all regular languages of polynomial growth, in existential Büchi arithmetic. This completes the proof of Theorem 3.

Note that **PREG**_p ⊈ **PA** for any base p ≥ 2: since M = ⟦Φ(x)⟧ is ultimately periodic for any formula Φ(x) of Presburger arithmetic, whenever ⟦Φ(x)⟧ is infinite it follows that d_M(n) = Ω(p^n), i.e., M is not of polynomial growth, whereas **PREG**_p contains infinite sets such as ⟦(10)^*⟧_p.

### **6 Conclusion**

The main result of this paper is that existential Büchi arithmetic is strictly less expressive than full Büchi arithmetic of any base. This is in contrast to Presburger arithmetic, for which it is known that its existential fragment is expressively complete.

When considered as the first-order theory of the structure ⟨N, 0, 1, +⟩, Presburger arithmetic does not have a quantifier elimination procedure. The extended structure ⟨N, 0, 1, +, {c | ·}_{c>1}⟩, however, admits quantifier elimination. Those additional divisibility predicates are definable in existential Presburger arithmetic. Our main result shows that even if we extended the structure underlying Büchi arithmetic with predicates definable in existential Büchi arithmetic, the resulting first-order theory would not admit quantifier elimination. On the positive side, Benedikt et al. [1, Thm. 3.1] give an extension of Büchi arithmetic which has quantifier elimination.
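Indeed, for every constant c > 1 the divisibility predicate c | · has the existential definition

$$c \mid x \;\Longleftrightarrow\; \exists y \colon\; x = \underbrace{y + \cdots + y}_{c\ \text{times}}\,.$$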

We conclude this paper with an interesting yet likely challenging open problem: is it decidable whether a set definable in Büchi arithmetic is definable in existential Büchi arithmetic?

**Acknowledgments.** We would like to thank Dmitry Chistikov and Alex Fung for inspiring discussions on the topics of this paper, and the FoSSaCS'21 reviewers for their comments and suggestions.

This work is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant agreement No. 852769, ARiAT).

### **References**


15. Woods, K.: The unreasonable ubiquitousness of quasi-polynomials. Elect. J. Combin. **21**(1), P1.44 (2014). https://doi.org/10.37236/3750

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Parametricity for Primitive Nested Types**

Patricia Johann, Enrico Ghiorzi, and Daniel Jeffries Appalachian State University, Boone, NC, USA {johannp,ghiorzie,jeffriesd}@appstate.edu

**Abstract.** This paper considers parametricity and its resulting free theorems for nested data types. Rather than representing nested types via their Church encodings in a higher-kinded or dependently typed extension of System F, we adopt a functional programming perspective and design a Hindley-Milner-style calculus with primitives for constructing nested types directly as fixpoints. Our calculus can express all nested types appearing in the literature, including truly nested types. At the term level, it supports primitive pattern matching, map functions, and fold combinators for nested types. Our main contribution is the construction of a parametric model for our calculus. This is both delicate and challenging: to ensure the existence of semantic fixpoints interpreting nested types, and thus to establish a suitable Identity Extension Lemma for our calculus, our type system must explicitly track functoriality of types, and cocontinuity conditions on the functors interpreting them must be appropriately threaded throughout the model construction. We prove that our model satisfies an appropriate Abstraction Theorem and verifies all standard consequences of parametricity for primitive nested types.

### **1 Introduction**

*Algebraic data types* (ADTs), both built-in and user-defined, have long been at the core of functional languages such as Haskell, ML, Agda, Epigram, and Idris. ADTs, such as that of natural numbers, can be unindexed. But they can also be indexed over other types. For example, the ADT of lists (here coded in Agda)

    data List (A : Set) : Set where
      nil  : List A
      cons : A → List A → List A

is indexed over its element type A. The instance of List at index A depends only on itself, and so is independent of List B for any other index B. That is, List, like all other ADTs, defines a *family of inductive types*, one for each index type.

Over time, there has been a notable trend toward data types whose nonregular indexing can capture invariants and other sophisticated properties that can be used for program verification and other applications. A simple example of such a type is given by Bird and Meertens' [4] prototypical nested type

```
data PTree (A : Set) : Set where
      pleaf : A → PTree A
      pnode : PTree (A × A) → PTree A
```
of perfect trees, which can be thought of as constraining lists to have lengths that are powers of 2. The above code makes clear that perfect trees at index type A are defined in terms of perfect trees at index type A × A. This is typical of nested types, one type instance of which can depend on others, so that the entire family


of types must actually be defined at once. A nested type thus defines not a family of inductive types, but rather an *inductive family of types*. Nested types include simple nested types, like perfect trees, none of whose recursive occurrences occur below another type constructor; "deep" nested types [18], such as the nested type

    data PForest (A : Set) : Set where
      fempty : PForest A
      fnode  : A → PTree (PForest A) → PForest A

of perfect forests, whose recursive occurrences appear below type constructors for other nested types; and truly nested types, such as the nested type

    data Bush (A : Set) : Set where
      bnil  : Bush A
      bcons : A → Bush (Bush A) → Bush A

of bushes, whose recursive occurrences appear below their own type constructors.
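To make the nesting concrete, here are two closed values of these types, given as a small sketch in the same Agda style; it assumes the declarations above are in scope together with ℕ and _×_ from the standard library, and the names `ptree4` and `bush3` are ours, introduced only for illustration.

```
open import Data.Nat     using (ℕ)
open import Data.Product using (_×_; _,_)

-- a perfect tree with the four leaves 1, 2, 3, 4, read as ((1 , 2) , (3 , 4));
-- each pnode pushes one level of pairing into the index type
ptree4 : PTree ℕ
ptree4 = pnode (pnode (pleaf ((1 , 2) , (3 , 4))))

-- a bush: the second argument of the outer bcons lives at index Bush ℕ
bush3 : Bush ℕ
bush3 = bcons 1 (bcons (bcons 2 bnil) bnil)
```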

*Relational parametricity* encodes a powerful notion of type-uniformity, or representation independence, for data types in polymorphic languages. It formalizes the intuition that a polymorphic program must act uniformly on all of its possible type instantiations by requiring that every such program preserves all relations between pairs of types at which it is instantiated. Parametricity was originally put forth by Reynolds [24] for System F [11], the calculus at the core of all polymorphic functional languages. It was later popularized as Wadler's "theorems for free" [27], so called because it can deduce properties of programs in such languages solely from their types, i.e., with no knowledge whatsoever of the text of the programs involved. Most of Wadler's free theorems are consequences of naturality for polymorphic list-processing functions. However, parametricity can also derive results that go beyond just naturality, such as correctness for ADTs of the program optimization known as *short cut fusion* [10,14].

But what about nested types? Does parametricity still hold if such types are added to polymorphic calculi? More practically, can we justifiably reason type-independently about (functions over) nested types in functional languages?

Type-independent reasoning about ADTs in functional languages is usually justified by first representing ADTs by their Church encodings, and then reasoning type-independently about these encodings. This is typically justified by constructing a parametric model — i.e., a model in which polymorphic functions preserve relations *à la* Reynolds — for a suitable fragment of System F, demonstrating that an initial algebra exists for the positive type constructor corresponding to the functor underlying an ADT of interest, and showing that each such initial algebra is suitably isomorphic to its corresponding Church encoding. In fact, this isomorphism of initial algebras and their Church encodings is one of the "litmus tests" for the goodness of a parametric model.

This approach works well for ADTs, which are always fixpoints of *first-order* functors, and whose Church encodings, which involve quantification over only type variables, are always expressible in System F. For example, List A is the fixpoint of the first-order functor F X = 1 + A × X and has Church encoding ∀α. α → (A → α → α) → α. But despite Cardelli's [7] claim that "virtually any basic type of interest can be encoded within F2" — i.e., within System F — non-ADT nested types cannot. Not even our prototypical nested type of perfect trees has a Church encoding expressible in System F! Indeed, PTree A cannot be represented as the fixpoint of any *first-order* functor. However, it can be seen as the instance at index A of the fixpoint of the *higher-order* functor H F A = A + F (A × A). It thus has Church encoding ∀f.(∀α. α → fα) → (∀α. f(α × α) → fα) → ∀α. fα, which requires quantification at the higher kind ∗→∗ for f. A similar situation obtains for any (non-ADT) nested type. Unfortunately, higher-kinded quantification is not available in System F, so if we want to reason type-independently about nested types in a language based on it we have only two options: *i*) move to an extension of System F, such as the higher-kinded calculus F<sup>ω</sup> or a dependent type theory, and reason via their Church encodings in a known parametric model for that extension, or *ii*) add nested types to System F as primitives — i.e., as primitive type-level fixpoints — and construct a parametric model for the result.
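To illustrate the higher-order view, the following is a small Agda sketch; the names `HPTree`, `roll`, and `unroll` are ours and are introduced only for this example, which assumes the PTree declaration above.

```
open import Data.Product using (_×_)

-- the higher-order functor underlying perfect trees:
-- its fixpoint, taken at index A, is PTree A
data HPTree (F : Set → Set) (A : Set) : Set where
  hleaf : A → HPTree F A
  hnode : F (A × A) → HPTree F A

-- rolling and unrolling witness that PTree is a fixpoint of HPTree
roll : {A : Set} → HPTree PTree A → PTree A
roll (hleaf a) = pleaf a
roll (hnode t) = pnode t

unroll : {A : Set} → PTree A → HPTree PTree A
unroll (pleaf a) = hleaf a
unroll (pnode t) = hnode t
```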

Since the type systems of F<sup>ω</sup> and dependent type theories are designed to extend System F with far more than non-ADT data types, it seems like serious overkill to pass to their parametric models to reason about nested types in System F. Indeed, such calculi support fundamentally new features that add complexity to their models that is entirely unnecessary for reasoning about nested types. This paper therefore pursues the second option above. We first design a Hindley-Milner-style calculus supporting primitive nested types, together with primitive types of natural transformations representing morphisms between them. Our calculus can express all nested types appearing in the literature, including truly nested types. At the term-level, it supports primitive pattern matching, map functions, and fold combinators for nested types.<sup>1</sup> Our main contribution is the construction of a parametric model for our calculus. This is both delicate and challenging. To ensure the existence of semantic fixpoints interpreting nested types, and thus to establish a suitable Identity Extension Lemma, our type system must explicitly track functoriality of types, and cocontinuity conditions on the functors interpreting them must be appropriately threaded throughout the model construction. Our model validates all standard consequences of parametricity in the presence of primitive nested types, including the isomorphism of primitive ADTs and their Church encodings, and correctness of short cut fusion for nested types. The relationship between naturality and parametricity has long been of interest, and our inclusion of a primitive type of natural transformations allows us to clearly delineate those consequences of parametricity that follow from naturality, from those, such as short cut fusion for nested types, that require the full power of parametricity.

<sup>1</sup> We leave incorporating general term-level recursion to future work because, as Pitts [23] reminds us, "it is hard to construct models of both impredicative polymorphism and fixpoint recursion". In fact, as the development in this paper shows, constructing a parametric model even for our predicative calculus with primitive nested types — and even without term-level fixpoints — is already rather involved. On the other hand, our calculus is strongly normalizing, so it perhaps edges us toward the kind of provably total practical programming language proposed in [27].

**Structure of this Paper** We introduce our calculus in Section 2. Its type system is based on the level-2-truncation of the higher-kinded grammar from [17], augmented with a primitive type of natural transformations. (Since [17] contains no term calculus, the issue of parametricity could not even be raised there.) In Section 3 we give set and relational interpretations of our types. Set interpretations are possible precisely because our calculus is predicative — as ensured by our primitive natural transformation types — and [17] guarantees that local finite presentability of Set makes it suitable for interpreting nested types. As is standard in categorical models, types are interpreted as functors from environments interpreting their type variable contexts to sets or relations, as appropriate. To ensure that these functors satisfy the cocontinuity properties needed for the semantic fixpoints interpreting nested types to exist, set environments must map k-ary type constructor variables to appropriately cocontinuous k-ary functors on sets, relation environments must map k-ary type constructor variables to appropriately cocontinuous k-ary relation transformers, and these cocontinuity conditions must be threaded through our type interpretations in such a way that an Identity Extension Lemma (Theorem 1) can be proved. Properly propagating the cocontinuity conditions requires considerable care, and Section 4, where it is done, is (apart from tracking functoriality in the calculus so that it is actually possible) where the bulk of the work in constructing our model lies.

In Section 5, we give set and relational interpretations for the terms of our calculus. As usual in categorical models, terms are interpreted as natural transformations from interpretations of their term contexts to interpretations of their types, and these must cohere in what is essentially a fibred way. In Section 6.1 we prove a scheme deriving free theorems that are consequences of naturality of polymorphic functions over nested types. This scheme is very general, and is parameterized over both the data type and the type of the polymorphic function at hand. It has, for example, analogues for nested types of Wadler's map-rearrangement free theorems as instances. In Section 6.2 we prove that our model satisfies an Abstraction Theorem (Theorem 4), which we use to derive other parametricity results that go beyond naturality. We conclude in Section 7.

**Related Work** There is a long line of work on categorical models of parametricity for System F; see, e.g., [3,6,8,9,12,13,20,26]. To our knowledge, all such models treat ADTs via their Church encodings, verifying in the just-constructed parametric model that each ADT is isomorphic to its encoding. This paper draws on this rich tradition of categorical models of parametricity for System F, but modifies them to treat nested types (and thus ADTs) as primitive data types. The only other extensions we know of System F with primitive data types are those in [19,21,22,23,27]. Wadler [27] treats full System F, and sketches parametricity for its extension with lists. Martin and Gibbons [21] outline a semantics for a grammar of primitive nested types similar to that in [17], but treat only polynomial nested types. Unfortunately, the model suggested in [21] is not entirely correct (see [17]), and parametricity is nowhere mentioned. Matthes [19] treats System F with non-polynomial ADTs and nested types, but focuses on expressivity of generalized Mendler iteration for them. He gives no semantics.

In [23], Pitts adds list ADTs to full System F with a term-level fixpoint primitive. Other ADTs are included in [22], but nested types are not expressible in either syntax. Pitts constructs parametric models for his calculi based on operational, rather than categorical, semantics. A benefit of using operational semantics to build parametric models is that it avoids needing to work in a suitable metatheory to accommodate System F's impredicativity. It is well-known that there are no set-based parametric models of System F [25], so parametric models for it and its extensions are often constructed in a syntactic metatheory such as the impredicative Calculus of Inductive Constructions (iCIC). By adding primitive nested types to a Hindley-Milner-style calculus and working in a categorical setting we side-step such metatheoretic distractions. It is important to note that different consequences of parametricity are available in syntactic and semantic metatheories. Consequences of parametricity are possible for both closed and open System F terms in a syntactic metatheory — although not all that can be formulated can be always proved; see, e.g., the end of Section 7 of [4]. By contrast, in a categorical metatheory consequences of parametricity are expressible only for *closed* terms. For this reason, validating the standard consequences of parametricity for closed terms is — going all the way back to Reynolds [24] all that is required for a model of parametricity to be considered good.

Atkey [2] treats parametricity for arbitrary higher kinds, constructing a parametric model for System F<sup>ω</sup> within iCIC, rather than in a semantic category. His construction is in some ways similar to ours, but he represents (now higherkinded) data types using Church encodings rather than as primitives. Moreover, the *fmap* functions associated to Atkey's functors must be *given*, presumably by the programmer, together with their underlying type constructors. This absolves him of imposing cocontinuity conditions on his model to ensure that fixpoints of his functors exist, but, unfortunately, he does not indicate which type constructors support *fmap* functions. We suspect explicitly spelling out which types can be interpreted as strictly positive functors would result in a full higher-kinded extension of a calculus akin to that presented here.

### **2 The Calculus**

#### **2.1 Types**

For each k ≥ 0, we assume countable sets **T**^k of *type constructor variables of arity* k (i.e., of kind ∗ → ... → ∗ → ∗, with k arrows and k + 1 ∗s in this sequence) and **F**^k of *functorial variables of arity* k, all mutually disjoint. The sets of all type constructor variables and functorial variables are **T** = ⋃_{k≥0} **T**^k and **F** = ⋃_{k≥0} **F**^k, respectively, and a *type variable* is any element of **T** ∪ **F**. We use lower case Greek letters for type variables, writing φ^k to indicate that φ ∈ **T**^k ∪ **F**^k, and omitting the arity indicator k when convenient. Letters from the beginning of the alphabet denote type variables of arity 0, i.e., elements of **T**^0 ∪ **F**^0. We write φ̄ for either a set {φ1, ..., φn} of type constructor variables or a set of functorial variables when the cardinality n of the set is unimportant or clear from context. If V is a set of type variables we write V, φ̄ for V ∪ φ̄ when V ∩ φ̄ = ∅. We omit the vector notation for a singleton set, thus writing φ, instead of φ̄, for {φ}.

If Γ is a finite subset of **T**, Φ is a finite subset of **F**, α is a finite subset of **F**<sup>0</sup> disjoint from <sup>Φ</sup>, and <sup>φ</sup><sup>k</sup> <sup>∈</sup> **<sup>F</sup>**<sup>k</sup> \ <sup>Φ</sup>, then the set <sup>F</sup> of well-formed types is given in Definition 1. The notation there entails that type application φF1...F<sup>k</sup> is allowed only when φ is a type variable of arity k, or φ is a subexpression of the form μψ<sup>k</sup>.λα1...αk.F . Moreover, if φ has arity k then φ must be applied to exactly k arguments. Accordingly, an overbar indicates a sequence of subexpressions whose length matches the arity of the type applied to it. Requiring that types are always in such η*-long normal form* avoids having to consider β-conversion of types. In a subexpression Nat<sup>α</sup>F G, the Nat operator binds all occurrences of the variables in α in F and G; intuitively, Nat<sup>α</sup>F G represents the type of a natural transformation in α from the functor F to the functor G. In a subexpression μφ<sup>k</sup>.λα.F, the μ operator binds all occurrences of the variable φ, and the λ operator binds all occurrences of the variables in α, in the body F.

A *type constructor*, or *non-functorial*, *context* is a finite set Γ of type constructor variables, and a *functorial context* is a finite set Φ of functorial variables. In Definition 1, a judgment of the form Γ; Φ F indicates that the type F is intended to be functorial in the variables in Φ but not necessarily in those in Γ.

**Definition 1.** *The formation rules for the set* F *of* (well-formed) types *are*

$$
\frac{}{\Gamma;\Phi \vdash \mathbf{0}} \qquad
\frac{}{\Gamma;\Phi \vdash \mathbf{1}} \qquad
\frac{\Gamma;\overline{\alpha} \vdash F \quad\; \Gamma;\overline{\alpha} \vdash G}{\Gamma;\emptyset \vdash \mathsf{Nat}^{\overline{\alpha}} F\, G} \qquad
\frac{\Gamma;\Phi \vdash F_1 \;\cdots\; \Gamma;\Phi \vdash F_k}{\Gamma;\Phi \vdash \phi^k F_1 ... F_k}\; \phi^k \in \Gamma \cup \Phi
$$
$$
\frac{\Gamma;\Phi \vdash F \quad\; \Gamma;\Phi \vdash G}{\Gamma;\Phi \vdash F + G} \qquad
\frac{\Gamma;\Phi \vdash F \quad\; \Gamma;\Phi \vdash G}{\Gamma;\Phi \vdash F \times G} \qquad
\frac{\Gamma;\phi^k,\overline{\alpha} \vdash F \quad\; \Gamma;\Phi \vdash G_1 \;\cdots\; \Gamma;\Phi \vdash G_k}{\Gamma;\Phi \vdash (\mu\phi^k.\lambda\alpha_1...\alpha_k.F)\, G_1...G_k}
$$

We write ⊢ F for ∅; ∅ ⊢ F. Definition 1 ensures that the expected weakening rules for well-formed types hold (but weakening does not change the contexts in which types can be formed). If Γ; ∅ ⊢ F and Γ; ∅ ⊢ G, then our rules allow formation of Γ; ∅ ⊢ Nat^∅ F G, which represents the arrow type Γ ⊢ F → G in our calculus. The type Γ; ∅ ⊢ Nat^ᾱ **1** F represents the ∀-type Γ; ∅ ⊢ ∀ᾱ.F. Some System F types, such as ∀α.(α → α) → α, are not representable in our calculus.

Since the body F of a type (μφ.λα.F)G can only be functorial in φ and the variables in α, the representation of *List* α as the ADT μβ. **1** + α × β cannot be functorial in α. By contrast, if *List* α is represented as the nested type (μφ.λβ. **1**+ β×φβ) α then we can choose α to be a functorial variable or not when forming the type. This observation holds for other ADTs as well; for example, if *Tree* α γ = μβ.α + β × γ × β, then α, γ; ∅ *Tree* α γ is well-formed, but ∅; α, γ *Tree* α γ is not. It also applies to some non-ADT types, such as *GRose* φ α = μβ.**1**+α×φβ, in which φ and α must both be non-functorial variables. It is in fact possible to allow "extra" 0-ary functorial variables in the body of μ-types (functorial variables of higher arity are the real problem). This would allow the first-order representations of ADTs to be functorial, but doing so requires some changes to the formation rule for μ-types, as well as the delicate threading of some additional conditions throughout our model construction. But since we can always use an ADT's (semantically equivalent) second-order representation when functoriality is needed, disallowing such "extra" variables does not negatively impact the expressivity of our calculus. We therefore pursue the simpler syntax here.

Definition 1 allows well-formed types to be functorial in no variables. Functorial variables can also be demoted to non-functorial status: if F[φ :== ψ] is the textual replacement of <sup>φ</sup> in <sup>F</sup>, then Γ, ψ<sup>k</sup>; <sup>Φ</sup> <sup>F</sup>[φ<sup>k</sup> :== <sup>ψ</sup><sup>k</sup>] is derivable whenever <sup>Γ</sup>; Φ, φ<sup>k</sup> <sup>F</sup> is. In addition to textual replacement, we also have substitution for types. If Γ; Φ F is a type, if Γ and Φ contain only type variables of arity 0, and if k = 0 for every occurrence of φ<sup>k</sup> bound by μ in F, then we say that F is *first-order*; otherwise we say that F is *second-order*. Substitution for first-order types is the usual capture-avoiding textual substitution. We write F[α := σ] for the result of substituting σ for α in F, and F[α<sup>1</sup> := F1, ..., α<sup>k</sup> := Fk], or F[α := F] when convenient, for F[α<sup>1</sup> := F1][α<sup>2</sup> := F2, ..., α<sup>k</sup> := Fk]. The operation (·)[φ :=<sup>α</sup> F] of *second-order type substitution along* α is defined by induction on types exactly as expected. The only interesting clause is that for type application, which defines (ψG)[φ :=<sup>α</sup> F] to be F[α := G[φ :=<sup>α</sup> F]] if ψ = φ and <sup>G</sup>[<sup>φ</sup> :=<sup>α</sup> <sup>F</sup>] otherwise. Of course, (·)[φ<sup>0</sup> :=<sup>∅</sup> <sup>F</sup>] coincides with first-order substitution. We omit α when convenient, but note that it is not correct to substitute along non-functorial variables. It is not hard to see that if <sup>Γ</sup>; Φ, φ<sup>k</sup> <sup>H</sup> and <sup>Γ</sup>; Φ, <sup>α</sup> <sup>F</sup> with <sup>|</sup>α<sup>|</sup> <sup>=</sup> <sup>k</sup>, then <sup>Γ</sup>; <sup>Φ</sup> <sup>H</sup>[<sup>φ</sup> :=<sup>α</sup> <sup>F</sup>]. Similarly, if Γ, φ<sup>k</sup>; <sup>Φ</sup> <sup>H</sup>, and if <sup>Γ</sup>; ψ, <sup>α</sup> <sup>F</sup> with <sup>|</sup>α<sup>|</sup> <sup>=</sup> <sup>k</sup> and <sup>Φ</sup>∩<sup>ψ</sup> <sup>=</sup> <sup>∅</sup>, then Γ, <sup>ψ</sup> ; Φ H[φ :=<sup>α</sup> F[ψ :== ψ ]].

#### **2.2 Terms**

Assume an infinite set V of term variables disjoint from **T** and **F**. If Γ is a type constructor context and Φ is a functorial context, then a *term context for* Γ *and* Φ is a finite set of bindings of the form x : F, where x ∈ V and Γ; Φ F. We adopt the above conventions for disjoint unions and vectors in term contexts. If Δ is a term context for Γ and Φ then the formation rules for the set of *well-formed terms over* Δ are given in Figure 1. An expression Lαx.t binds all occurrences of the type variables in α in the types of x and t, as well as all occurrences of x in t. In the rule for tKs there is one functorial expression in K for every variable in α. In the rule for mapF ,G <sup>H</sup> there is one functorial expression in F and one functorial expression in G for each variable in φ. Moreover, for each φ<sup>k</sup> in φ the number of variables in β in the judgments for functorial expresssions in F and G is k. In the rules for in<sup>H</sup> and fold<sup>F</sup> <sup>H</sup>, the variables in β are fresh with respect to H, and there is one β for every α. Substitution for terms is the obvious extension of the usual capture-avoiding textual substitution, and weakening is respected.

The "extra" functorial variables in γ in the rules for mapF ,G <sup>H</sup> (i.e., those variables not affected by the substitution of φ) allow us to map polymorphic functions over nested types. Suppose, for example, that we want to map the polymorphic function *flatten* : Nat<sup>β</sup>(*PTree* β) (*List* β) over lists. The map term for this is typeable as follows:

$$\begin{array}{c} \Gamma; \alpha, \gamma \vdash List \, \alpha & \Gamma; \gamma \vdash P T \text{re} \, \gamma & \Gamma; \gamma \vdash List \, \gamma \\\hline \hline \Gamma; \emptyset \mid \emptyset \vdash \mathsf{map}\_{List \, \alpha}^{P T \text{re} \, \gamma, List \, \gamma} : \mathsf{Nat}^{\emptyset}(\mathsf{Nat}^{\gamma}(P T \text{re} \, \gamma) \, (List \, \gamma)) \, (\mathsf{Nat}^{\gamma} \, (List \, (P T \text{re} \, \gamma)) \, (List \, (List \, \gamma))) \end{array}$$


However, this derivation would not be possible without the "extra" variable γ.

Our calculus is expressive enough to define, e.g., a function *reversePTree* : Nat<sup>α</sup> (*PTree* α)(*PTree* α) that reverses the order of the leaves in a perfect tree. It maps the perfect tree ((1, 2),(3, 4)) to ((4, 3),(2, 1)). Unfortunately, we cannot define recursive functions — such as a concatenation function for perfect trees or a zip function for bushes — that take as inputs a nested type and an argument of another type, both of which are parameterized over the same variable. The fundamental issue is that recursion is expressible only via fold, which produces natural transformations in some variables α from μ-types to other functors F. The restrictions on Nat-types entail that F cannot itself be a Nattype containing <sup>α</sup>, so, e.g., Nat<sup>α</sup> (*PTree* <sup>α</sup>)(Nat<sup>∅</sup> (*PTree* <sup>α</sup>)(*PTree* (α×α))) is not well-typed. Uncurrying gives Nat<sup>α</sup> (*PTree* <sup>α</sup> <sup>×</sup> *PTree* <sup>α</sup>)(*PTree* (<sup>α</sup> <sup>×</sup> <sup>α</sup>)), which is well-typed, but fold cannot produce a term of this type because *PTree* α×*PTree* α is not a μ-type. Our calculus can, however, express types of recursive functions that take multiple nested types as arguments, provided they are parameterized over disjoint sets of type variables and the return type of the function is parameterized over only the variables occurring in the type of its final argument. Even for ADTs there is a difference between which folds over them we can type when they are viewed as ADTs (i.e., as fixpoints of first-order functors) versus as proper nested types (i.e., as fixpoints of higher-order functors). This is because, in the return type of fold, the arguments of the μ-type must be variables bound by Nat. For ADTs, the μ-type takes no arguments, making it possible to write recursive functions, such as a concatenation function for lists of type α; ∅ Nat<sup>∅</sup> (μβ.**1**+α×β) (Nat∅(μβ.**1**+α×β) (μβ.**1**+α×β)). This is not possible for nested types — even when they are semantically equivalent to ADTs.
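For comparison, in a host language such as Agda the function reversePTree can be written directly with (polymorphic) structural recursion rather than with the fold combinator of our calculus. The following sketch, with an auxiliary map function, is one such rendering; both names are ours, and the PTree declaration above is assumed to be in scope.

```
open import Data.Product using (_×_; _,_)

mapPTree : {A B : Set} → (A → B) → PTree A → PTree B
mapPTree f (pleaf a) = pleaf (f a)
mapPTree f (pnode t) = pnode (mapPTree (λ { (x , y) → (f x , f y) }) t)

-- reverses the order of the leaves,
-- e.g. ((1 , 2) , (3 , 4)) is mapped to ((4 , 3) , (2 , 1))
reversePTree : {A : Set} → PTree A → PTree A
reversePTree (pleaf a) = pleaf a
reversePTree (pnode t) =
  pnode (mapPTree (λ { (x , y) → (y , x) }) (reversePTree t))
```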

Interestingly, even some recursive functions of a single proper nested type (e.g., a reverse function for bushes that is a true involution) cannot be expressed as folds because the algebra arguments needed to define them are again recursive functions with types of the same problematic form as the type of, e.g., a zip function for perfect trees. Expressivity of folds for nested types has long been a vexing issue, and this is naturally inherited by our calculus. Adding more expressive recursion combinators — e.g., generalized folds or Mendler iterators — could help, but since this is orthogonal to the issue of parametricity in the presence of primitive nested types we do not consider it further here.

### **3 Interpreting Types**

We denote the category of sets and functions by Set. The category Rel has as objects triples (A, B, R), where R is a relation between sets A and B. It has as morphisms from (A, B, R) to (A′, B′, R′) pairs (f : A → A′, g : B → B′) of morphisms in Set such that (f a, g b) ∈ R′ if (a, b) ∈ R. We may write R : Rel(A, B) for (A, B, R). If R : Rel(A, B) we write π_1 R and π_2 R for the *domain* A of R and the *codomain* B of R, respectively, and assume π_1 and π_2 are surjective. We write Eq_A = (A, A, {(x, x) | x ∈ A}) for the *equality relation* on the set A.

The key idea underlying Reynolds' parametricity is to give each type F(α) with one free variable α a *set interpretation* F<sup>0</sup> taking sets to sets and a *relational interpretation* F<sup>1</sup> taking relations R : Rel(A, B) to relations F1(R) : Rel(F0(A), F0(B)), and to interpret each term t(α, x) : F(α) with one free term variable x : G(α) as a map t<sup>0</sup> associating to each set A a function t0(A) : G0(A) → F0(A). These interpretations are given inductively on the structures of F and t in such a way that they imply two fundamental theorems. The first is an *Identity Extension Lemma*, which states that F1(EqA) = Eq<sup>F</sup>0(A), and is the essential property that makes a model relationally parametric rather than just induced by a logical relation. The second is an *Abstraction Theorem*, which states that, for any R : Rel(A, B), (t0(A), t0(B)) is a morphism in Rel from (G0(A), G0(B), G1(R)) to (F0(A), F0(B), F1(R)). The Identity Extension Lemma is similar to the Abstraction Theorem except that it holds for *all* elements of a type's interpretation, not just those that interpret terms. Similar theorems are required for types and terms with any number of free variables.

The key to proving our Identity Extension Lemma is a familiar "cutting down" of the interpretations of universally quantified types to include only the "parametric" elements; the relevant types here are Nat types. This requires that the set interpretations of types (Section 3.1) are defined simultaneously with their relational interpretations (Section 3.2). While set interpretations are relatively straightforward, relational interpretations are less so because of the cocontinuity conditions needed to know they are well-defined. We develop these conditions in Sections 3.1 and 3.2. This separates our set and relational interpretations in space, but has no other impact on the mutually inductive definitions.

$$
\begin{aligned}
\llbracket \Gamma; \Phi \vdash \mathbf{0} \rrbracket^{\mathsf{Set}}\rho &= 0\\
\llbracket \Gamma; \Phi \vdash \mathbf{1} \rrbracket^{\mathsf{Set}}\rho &= 1\\
\llbracket \Gamma; \emptyset \vdash \mathsf{Nat}^{\overline{\alpha}} F\, G \rrbracket^{\mathsf{Set}}\rho &= \{\eta : \lambda \overline{A}.\, \llbracket \Gamma; \overline{\alpha} \vdash F \rrbracket^{\mathsf{Set}}\rho[\overline{\alpha} := \overline{A}] \Rightarrow \lambda \overline{A}.\, \llbracket \Gamma; \overline{\alpha} \vdash G \rrbracket^{\mathsf{Set}}\rho[\overline{\alpha} := \overline{A}] \mid \forall \overline{A}, \overline{B} : \mathsf{Set}.\, \forall \overline{R} : \mathsf{Rel}(\overline{A}, \overline{B}).\\
&\qquad (\eta_{\overline{A}}, \eta_{\overline{B}}) : \llbracket \Gamma; \overline{\alpha} \vdash F \rrbracket^{\mathsf{Rel}}\mathit{Eq}_\rho[\overline{\alpha} := \overline{R}] \to \llbracket \Gamma; \overline{\alpha} \vdash G \rrbracket^{\mathsf{Rel}}\mathit{Eq}_\rho[\overline{\alpha} := \overline{R}]\}\\
\llbracket \Gamma; \Phi \vdash \phi \overline{F} \rrbracket^{\mathsf{Set}}\rho &= (\rho\phi)\llbracket \Gamma; \Phi \vdash \overline{F} \rrbracket^{\mathsf{Set}}\rho\\
\llbracket \Gamma; \Phi \vdash F + G \rrbracket^{\mathsf{Set}}\rho &= \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Set}}\rho + \llbracket \Gamma; \Phi \vdash G \rrbracket^{\mathsf{Set}}\rho\\
\llbracket \Gamma; \Phi \vdash F \times G \rrbracket^{\mathsf{Set}}\rho &= \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Set}}\rho \times \llbracket \Gamma; \Phi \vdash G \rrbracket^{\mathsf{Set}}\rho\\
\llbracket \Gamma; \Phi \vdash (\mu\phi.\lambda\overline{\alpha}.H)\overline{G} \rrbracket^{\mathsf{Set}}\rho &= (\mu T^{\mathsf{Set}}_{H,\rho})\llbracket \Gamma; \Phi \vdash \overline{G} \rrbracket^{\mathsf{Set}}\rho\\
\text{where } T^{\mathsf{Set}}_{H,\rho}\, F &= \lambda \overline{A}.\, \llbracket \Gamma; \phi, \overline{\alpha} \vdash H \rrbracket^{\mathsf{Set}}\rho[\phi := F][\overline{\alpha} := \overline{A}]
\;\text{ and }\;
T^{\mathsf{Set}}_{H,\rho}\, \eta = \lambda \overline{A}.\, \llbracket \Gamma; \phi, \overline{\alpha} \vdash H \rrbracket^{\mathsf{Set}}\mathit{id}_\rho[\phi := \eta][\overline{\alpha} := \mathit{id}_{\overline{A}}]
\end{aligned}
$$

**Fig. 2.** Set interpretation

#### **3.1 Interpreting Types as Sets**

We interpret types in our calculus as ω-cocontinuous functors on locally finitely presentable categories [1]. Since functor categories of locally finitely presentable categories are again locally finitely presentable, this ensures that the fixpoints interpreting μ-types in Set and Rel exist, and thus that both the set and relational interpretations of all of the types in Definition 1 are well-defined [17]. To bootstrap this process, we interpret type variables as ω-cocontinuous functors. If C and D are locally finitely presentable categories, we write [C, D] for the category of ω-cocontinuous functors from C to D.

<sup>A</sup> *set environment* maps each type variable in **<sup>T</sup>**<sup>k</sup> <sup>∪</sup> **<sup>F</sup>**<sup>k</sup> to an element of [Set<sup>k</sup>, Set]. A morphism <sup>f</sup> : <sup>ρ</sup> <sup>→</sup> <sup>ρ</sup> for set environments <sup>ρ</sup> and <sup>ρ</sup> with <sup>ρ</sup>|**<sup>T</sup>** <sup>=</sup> <sup>ρ</sup> |**T** maps each type constructor variable <sup>ψ</sup><sup>k</sup> <sup>∈</sup> **<sup>T</sup>** to the identity natural transformation on ρψ<sup>k</sup> = ρ <sup>ψ</sup><sup>k</sup> and each functorial variable <sup>φ</sup><sup>k</sup> <sup>∈</sup> **<sup>F</sup>** to a natural transformation from the k-ary functor ρφ<sup>k</sup> on Set to the k-ary functor ρ φ<sup>k</sup> on Set. Composition of morphisms on set environments is componentwise, with the identity morphism mapping each one to itself. This gives a category of set environments and morphisms between them, denoted SetEnv. We identify a functor in [Set<sup>0</sup>, Set] with its value on <sup>∗</sup>, and consider a set environment to map a type variable of arity 0 to a set. If α = {α1, ..., α<sup>k</sup>} and A = {A1, ..., A<sup>k</sup>}, then we write ρ[α := A] for the set environment ρ such that ρ α<sup>i</sup> = A<sup>i</sup> for i = 1, ..., k and ρ α = ρα if α ∈ {α1, ..., α<sup>k</sup>}. If ρ ∈ SetEnv we write Eq<sup>ρ</sup> for the relation environment (see Section 3) such that Eqρv = Eqρv for every type variable v. The *set interpretation* -·<sup>S</sup>et : F → [SetEnv, Set] is defined in Figure 2. The relational interpretations in the second clause of Figure 2 are given in full in Figure 3.

If <sup>ρ</sup> <sup>∈</sup> SetEnv and <sup>F</sup> we write - <sup>F</sup><sup>S</sup>et for - <sup>F</sup><sup>S</sup>et<sup>ρ</sup> since the environment is immaterial. The third clause of Figure 2 does indeed define a set: local finite presentability of Set and <sup>ω</sup>-cocontinuity of -<sup>Γ</sup>; <sup>α</sup> <sup>F</sup><sup>S</sup>et<sup>ρ</sup> ensure that the set of natural transformations {<sup>η</sup> : -<sup>Γ</sup>; <sup>α</sup> <sup>F</sup><sup>S</sup>et<sup>ρ</sup> <sup>⇒</sup> -<sup>Γ</sup>; <sup>α</sup> <sup>G</sup><sup>S</sup>etρ} (which contains -<sup>Γ</sup>; ∅ Nat<sup>α</sup> F G<sup>S</sup>etρ) is a subset of (-<sup>Γ</sup>; <sup>α</sup> <sup>G</sup><sup>S</sup>etρ[<sup>α</sup> := <sup>S</sup>])(-<sup>Γ</sup>;α<sup>F</sup> Setρ[α:=S]) <sup>S</sup> = (S1, ..., S|α|), and <sup>S</sup><sup>i</sup> is a finite set for <sup>i</sup> = 1, ..., <sup>|</sup>α<sup>|</sup> . There are countably many tuples <sup>S</sup>, each giving a morphism from -<sup>Γ</sup>; <sup>α</sup> <sup>F</sup><sup>S</sup>etρ[<sup>α</sup> := <sup>S</sup>] to -<sup>Γ</sup>; <sup>α</sup> <sup>G</sup><sup>S</sup>etρ[<sup>α</sup> := <sup>S</sup>], and only Set-many such morphisms since Set is locally small. In addition, -<sup>Γ</sup>; ∅ Nat<sup>α</sup>F G<sup>S</sup>et is <sup>ω</sup>-cocontinuous since it is constant on <sup>ω</sup>-directed sets. Interpretations of Nat types ensure that -<sup>Γ</sup> <sup>F</sup> <sup>→</sup> <sup>G</sup><sup>S</sup>et and <sup>Γ</sup> ∀α.F<sup>S</sup>et are as expected in parametric models.


#### **Definition 2.** *The action of* -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>et *on* <sup>f</sup> : <sup>ρ</sup> <sup>→</sup> <sup>ρ</sup> *in* SetEnv *is given by:*

**–** -<sup>Γ</sup>; <sup>Φ</sup> **<sup>0</sup>**<sup>S</sup>et<sup>f</sup> <sup>=</sup> *id* <sup>0</sup> **–** -<sup>Γ</sup>; <sup>Φ</sup> **<sup>1</sup>**<sup>S</sup>et<sup>f</sup> <sup>=</sup> *id* <sup>1</sup> **–** -<sup>Γ</sup>; ∅ Nat<sup>α</sup> F G<sup>S</sup>et<sup>f</sup> <sup>=</sup> *id*-<sup>Γ</sup>;∅Nat<sup>α</sup> F GSet<sup>ρ</sup> **–** -<sup>Γ</sup>; <sup>Φ</sup> φF<sup>S</sup>et<sup>f</sup> : -<sup>Γ</sup>; <sup>Φ</sup> φF<sup>S</sup>et<sup>ρ</sup> <sup>→</sup> -<sup>Γ</sup>; <sup>Φ</sup> φF<sup>S</sup>etρ = (ρφ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>et<sup>ρ</sup> → (ρ φ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>etρ *is defined by* -<sup>Γ</sup>; <sup>Φ</sup> φF<sup>S</sup>et<sup>f</sup> = (fφ)-<sup>Γ</sup>;Φ<sup>F</sup> Setρ- ◦ (ρφ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>et<sup>f</sup> = (ρ φ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>et<sup>f</sup> ◦ (fφ)-<sup>Γ</sup>;Φ<sup>F</sup> Set<sup>ρ</sup> *. This holds since* ρφ *and* ρ φ *are functors and* fφ : ρφ → ρ φ *is a natural transformation.* **–** -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup> <sup>+</sup> <sup>G</sup><sup>S</sup>et<sup>f</sup> *is defined by* -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup> <sup>+</sup> <sup>G</sup><sup>S</sup>etf(inL <sup>x</sup>) = inL (-<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>etfx) *and* -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup> <sup>+</sup> <sup>G</sup><sup>S</sup>etf(inR <sup>y</sup>) = inR (-<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>etfy) **–** -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup> <sup>×</sup> <sup>G</sup><sup>S</sup>et<sup>f</sup> <sup>=</sup> -<sup>Γ</sup>; <sup>Φ</sup> <sup>F</sup><sup>S</sup>et<sup>f</sup> <sup>×</sup> -<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>et<sup>f</sup> **–** -<sup>Γ</sup>; <sup>Φ</sup> (μφ.λα.H)G<sup>S</sup>et<sup>f</sup> : -<sup>Γ</sup>; <sup>Φ</sup> (μφ.λα.H)G<sup>S</sup>et<sup>ρ</sup> <sup>→</sup> -<sup>Γ</sup>; <sup>Φ</sup> (μφ.λα.H)G<sup>S</sup>etρ =(μT <sup>S</sup>et H,ρ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>et<sup>ρ</sup> <sup>→</sup> (μT <sup>S</sup>et H,ρ- )-<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>etρ *is defined by* (μT <sup>S</sup>et H,f )-<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>etρ ◦ (μT <sup>S</sup>et H,ρ)-<sup>Γ</sup>; <sup>Φ</sup> <sup>G</sup><sup>S</sup>et<sup>f</sup> <sup>=</sup> (μT <sup>S</sup>et H,ρ *and*

$$\begin{array}{l} \mu \left( \mu T\_{H,\rho'}^{\mathsf{Set}} \right) \overline{\left\| T; \Phi \vdash G \right\|}^{\mathsf{Set}} f \circ \left( \mu T\_{H,f}^{\mathsf{Set}} \right) \overline{\left\| T; \Phi \vdash G \right\|}^{\mathsf{Set}} \rho. \text{ This holds since } \mu T\_{H,\rho}^{\mathsf{Set}} \text{ and } \mu \mu T\_{H,f}^{\mathsf{Set}}: \mu T\_{H,\rho}^{\mathsf{Set}}: \mu T\_{H,\rho'}^{\mathsf{Set}} \text{ is a natural transformation.}\\ \mu T\_{H,\rho'}^{\mathsf{Set}} \text{ are functors and } \mu T\_{H,f}^{\mathsf{Set}}: \mu T\_{H,\rho}^{\mathsf{Set}} \to \mu T\_{H,\rho'}^{\mathsf{Set}} \text{ is a natural transformation.} \end{array}$$

#### **3.2 Interpreting Types as Relations**

A k*-ary relation transformer* F is a triple (F<sup>1</sup>, F<sup>2</sup>, F∗), where F<sup>1</sup>, F<sup>2</sup> : [Set<sup>k</sup>, Set] and F<sup>∗</sup> : [Rel<sup>k</sup>, Rel] are functors, if R<sup>i</sup> : Rel(Ai, Bi) for i = 1, ..., k then F∗R : Rel(F<sup>1</sup>A, F<sup>2</sup>B), and if (αi, βi) <sup>∈</sup> Hom<sup>R</sup>e<sup>l</sup>(Ri, Si) for <sup>i</sup> = 1, ..., k, then <sup>F</sup>∗(α, β) = (F<sup>1</sup>α, F<sup>2</sup>β). We define FR to be F∗R and F(α, β) to be F∗(α, β). The last clause above expands to: if (a, b) ∈ R implies (α a, β b) ∈ S then (c, d) ∈ F∗R implies (F<sup>1</sup>α c, F<sup>2</sup>β d) <sup>∈</sup> <sup>F</sup>∗S. We identify a 0-ary relation transformer (A, B, R) with R : Rel(A, B), and write π1F for F<sup>1</sup> and π2F for F<sup>2</sup>. Below we extend these conventions to relation environments in the obvious ways.

The category RT<sup>k</sup> of k-ary relation transformers is given by the following data: an object of RT<sup>k</sup> is a k-ary relation transformer; a morphism δ : (G<sup>1</sup>, G<sup>2</sup>, G∗) <sup>→</sup> (H<sup>1</sup>, H<sup>2</sup>, H∗) in RT<sup>k</sup> is a pair of natural transformations (δ<sup>1</sup>, δ<sup>2</sup>) where <sup>δ</sup><sup>1</sup> : <sup>G</sup><sup>1</sup> <sup>→</sup> <sup>H</sup><sup>1</sup>, <sup>δ</sup><sup>2</sup> : <sup>G</sup><sup>2</sup> <sup>→</sup> <sup>H</sup><sup>2</sup> such that, for all <sup>R</sup> : Rel(A, B), if (x, y) <sup>∈</sup> G∗R then (δ<sup>1</sup> Ax, δ<sup>2</sup> <sup>B</sup>y) ∈ H∗R; and identity morphisms and composition are inherited from the category of functors on Set. An endofunctor H on RT<sup>k</sup> is a triple H = (H<sup>1</sup>, H<sup>2</sup>, H∗), where H<sup>1</sup> and H<sup>2</sup> are functors from [Set<sup>k</sup>, Set] to [Set<sup>k</sup>, Set]; H<sup>∗</sup> is a functor from RT<sup>k</sup> to [Rel<sup>k</sup>, Rel]; for all R : Rel(A, B), π1((H∗(δ<sup>1</sup>, δ<sup>2</sup>))R)=(H<sup>1</sup>δ<sup>1</sup>)<sup>A</sup> and π2((H∗(δ<sup>1</sup>, δ<sup>2</sup>))R)=(H<sup>2</sup>δ<sup>2</sup>)B; the action of H on objects is given by H (F<sup>1</sup>, F<sup>2</sup>, F∗)=(H<sup>1</sup>F<sup>1</sup>, H<sup>2</sup>F<sup>2</sup>, H∗(F<sup>1</sup>, F<sup>2</sup>, F∗)); and the action of H on morphisms is given by H (δ<sup>1</sup>, δ<sup>2</sup>)=(H<sup>1</sup>δ<sup>1</sup>, H<sup>2</sup>δ<sup>2</sup>) for (δ<sup>1</sup>, δ<sup>2</sup>) : (F<sup>1</sup>, F<sup>2</sup>, F∗) <sup>→</sup> (G<sup>1</sup>, G<sup>2</sup>, G∗). Since applying an endofunctor <sup>H</sup> to <sup>k</sup>-ary relation transformers and morphisms between them must give k-ary relation transformers and morphisms between them, this definition implicitly requires the following three conditions to hold: *i*) H∗(F<sup>1</sup>, F<sup>2</sup>, F∗)R : Rel(H<sup>1</sup>F<sup>1</sup>A, H<sup>2</sup>F<sup>2</sup>B) if R<sup>1</sup> : Rel(A1, B1), ..., R<sup>k</sup> : Rel(Ak, Bk); *ii*) H∗(F<sup>1</sup>, F<sup>2</sup>, F∗)(α, β)=(H<sup>1</sup>F<sup>1</sup>α, H<sup>2</sup>F<sup>2</sup>β) if (α1, β1) <sup>∈</sup> Hom<sup>R</sup>e<sup>l</sup>(R1, S1), ...,(αk, βk) <sup>∈</sup> Hom<sup>R</sup>e<sup>l</sup>(Rk, Sk); and *iii*) if (δ<sup>1</sup>, δ<sup>2</sup>) : (F<sup>1</sup>, F<sup>2</sup>, F∗) <sup>→</sup> (G<sup>1</sup>, G<sup>2</sup>, G∗) and <sup>R</sup><sup>1</sup> : Rel(A1, B1), ..., R<sup>k</sup> : Rel(Ak, Bk), then ((H<sup>1</sup>δ<sup>1</sup>)Ax,(H<sup>2</sup>δ<sup>2</sup>)By) <sup>∈</sup> <sup>H</sup>∗(G<sup>1</sup>, G<sup>2</sup>, G∗)<sup>R</sup> if (x, y) <sup>∈</sup> <sup>H</sup>∗(F<sup>1</sup>, F<sup>2</sup>, F∗)R. Note, however, that this last condition is automatically satisfied because it is implied by the third condition on functors on relation transformers.

If H and K are endofunctors on RTk, then a *natural transformation* σ : <sup>H</sup> <sup>→</sup> <sup>K</sup> is a pair <sup>σ</sup> = (σ<sup>1</sup>, σ<sup>2</sup>), where <sup>σ</sup><sup>1</sup> : <sup>H</sup><sup>1</sup> <sup>→</sup> <sup>K</sup><sup>1</sup> and <sup>σ</sup><sup>2</sup> : <sup>H</sup><sup>2</sup> <sup>→</sup> <sup>K</sup><sup>2</sup> are natural transformations between endofunctors on [Set<sup>k</sup>, Set] and the component of <sup>σ</sup> at <sup>F</sup> <sup>∈</sup> RT<sup>k</sup> is given by <sup>σ</sup><sup>F</sup> = (σ<sup>1</sup> <sup>F</sup> <sup>1</sup> , σ<sup>2</sup> <sup>F</sup> <sup>2</sup> ). This definition entails that σ<sup>i</sup> <sup>F</sup> <sup>i</sup> is natural in F<sup>i</sup> : [Set<sup>k</sup>, Set], and, for every F, both (σ<sup>1</sup> <sup>F</sup> <sup>1</sup> )<sup>A</sup> and (σ<sup>2</sup> <sup>F</sup> <sup>2</sup> )<sup>A</sup> are natural in A. Moreover, since the results of applying σ to k-ary relation transformers must be morphisms of k-ary relation transformers, it implicitly requires that (σ<sup>F</sup> )<sup>R</sup> = ((σ<sup>1</sup> <sup>F</sup> <sup>1</sup> )A,(σ<sup>2</sup> <sup>F</sup> <sup>2</sup> )B) is a morphism in Rel for any k-tuple of relations <sup>R</sup> : Rel(A, B), i.e., that if (x, y) <sup>∈</sup> <sup>H</sup>∗FR, then ((σ<sup>1</sup> <sup>F</sup> <sup>1</sup> )Ax,(σ<sup>2</sup> <sup>F</sup> <sup>2</sup> )By) ∈ K∗FR.

Critically, we can compute ω-directed colimits in RTk. Indeed, if D is an <sup>ω</sup>-directed set then lim−→<sup>d</sup>∈D(F<sup>1</sup> <sup>d</sup> , F<sup>2</sup> <sup>d</sup> , F<sup>∗</sup> <sup>d</sup> ) = (lim−→<sup>d</sup>∈D F<sup>1</sup> <sup>d</sup> , lim−→<sup>d</sup>∈D F<sup>2</sup> <sup>d</sup> , lim−→<sup>d</sup>∈D F<sup>∗</sup> <sup>d</sup> ). We define an endofunctor T = (T<sup>1</sup>, T<sup>2</sup>, T <sup>∗</sup>) on RT<sup>k</sup> to be ω*-cocontinuous* if T<sup>1</sup> and T<sup>2</sup> are ω-cocontinuous endofunctors on [Set<sup>k</sup>, Set] and T <sup>∗</sup> is an ω-cocontinuous functor from RT<sup>k</sup> to [Rel<sup>k</sup>, Rel], i.e., is in [RTk, [Rel<sup>k</sup>, Rel]]. Now, for any k, any A : Set, and any R : Rel(A, B), let K<sup>S</sup>et <sup>A</sup> be the constantly A-valued functor from Set<sup>k</sup> to Set and K<sup>R</sup>e<sup>l</sup> <sup>R</sup> be the constantly R-valued functor from Rel<sup>k</sup> to Rel. Also let 0 denote the initial object of either Set or Rel, as appropriate. Observing that, for every k, K<sup>S</sup>et <sup>0</sup> is initial in [Set<sup>k</sup>, Set], and <sup>K</sup><sup>R</sup>e<sup>l</sup> <sup>0</sup> is initial in [Rel<sup>k</sup>, Rel], we have that, for each k, K<sup>0</sup> = (K<sup>S</sup>et <sup>0</sup> , K<sup>S</sup>et <sup>0</sup> , K<sup>R</sup>e<sup>l</sup> <sup>0</sup> ) is initial in RTk. Thus, if <sup>T</sup> = (T<sup>1</sup>, T<sup>2</sup>, T <sup>∗</sup>) : RT<sup>k</sup> <sup>→</sup> RT<sup>k</sup> is an endofunctor on RT<sup>k</sup> we can define the relation transformer μT to be lim−→<sup>n</sup>∈**<sup>N</sup>** <sup>T</sup> <sup>n</sup>K<sup>0</sup> = (μT<sup>1</sup>, μT<sup>2</sup>, lim−→<sup>n</sup>∈**<sup>N</sup>**(<sup>T</sup> <sup>n</sup>K0)∗). If T : [RTk, RTk] then μT is a fixpoint for T, i.e., μT ∼= T(μT). The isomorphism is given by (*in*1, *in*2) : <sup>T</sup>(μT) <sup>→</sup> μT and (in−<sup>1</sup> <sup>1</sup> , in−<sup>1</sup> <sup>2</sup> ) : μT → T(μT) in RTk. The latter is always a morphism in RTk, but the former need not be if T is not ω-cocontinuous. Since μT's third component is the colimit in [Rel<sup>k</sup>, Rel] of third components of relation transformers, rather than a fixpoint of an endofunctor on [Rel<sup>k</sup>, Rel], there is an asymmetry between μT's first two and third components.

A *relation environment* maps each type variable in $\mathbf{T}^k \cup \mathbf{F}^k$ to a $k$-ary relation transformer. A morphism $f : \rho \to \rho'$ between relation environments $\rho$ and $\rho'$ with $\rho|_{\mathbf{T}} = \rho'|_{\mathbf{T}}$ maps each $\psi^k \in \mathbf{T}$ to the identity morphism on $\rho\psi^k = \rho'\psi^k$ and each $\phi^k \in \mathbf{F}$ to a morphism from the $k$-ary relation transformer $\rho\phi$ to the $k$-ary relation transformer $\rho'\phi$. Composition of morphisms of relation environments is componentwise, with the identity morphism mapping each type variable to the identity on its image; this gives a category $\mathsf{RelEnv}$ of relation environments and their morphisms. We identify a $0$-ary relation transformer with its codomain, and consider a relation environment to map a type variable of arity $0$ to a relation. We write $\rho[\alpha := R]$ for the relation environment $\rho'$ such that $\rho'\alpha_i = R_i$ for $i = 1, \ldots, k$ and $\rho'\alpha = \rho\alpha$ if $\alpha \notin \{\alpha_1, \ldots, \alpha_k\}$. If $\rho \in \mathsf{RelEnv}$, we write $\pi_1\rho$ and $\pi_2\rho$ for the set environments mapping each type variable $\phi$ to the functors $(\rho\phi)^1$ and $(\rho\phi)^2$, respectively.

For each $k$, an $\omega$-cocontinuous functor $H : [\mathsf{RelEnv}, \mathsf{RT}_k]$ is a triple $H = (H^1, H^2, H^*)$, where $H^1, H^2 : [\mathsf{SetEnv}, [\mathsf{Set}^k, \mathsf{Set}]]$ and $H^* : [\mathsf{RelEnv}, [\mathsf{Rel}^k, \mathsf{Rel}]]$; for all $R : \mathsf{Rel}(A, B)$ and morphisms $f$ in $\mathsf{RelEnv}$, $\pi_1(H^*f\,R) = H^1(\pi_1 f)\,A$ and $\pi_2(H^*f\,R) = H^2(\pi_2 f)\,B$; the action of $H$ on $\rho$ in $\mathsf{RelEnv}$ is given by $H\rho = (H^1(\pi_1\rho), H^2(\pi_2\rho), H^*\rho)$; and the action of $H$ on morphisms $f : \rho \to \rho'$ in $\mathsf{RelEnv}$ is given by $Hf = (H^1(\pi_1 f), H^2(\pi_2 f))$. The last two points above give: *i*) if $R_i : \mathsf{Rel}(A_i, B_i)$, $i = 1, \ldots, k$, then $H^*\rho\,R : \mathsf{Rel}(H^1(\pi_1\rho)\,A, H^2(\pi_2\rho)\,B)$; *ii*) if $(\alpha_i, \beta_i) \in \mathrm{Hom}_{\mathsf{Rel}}(R_i, S_i)$, $i = 1, \ldots, k$, then $H^*\rho\,(\alpha, \beta) = (H^1(\pi_1\rho)\,\alpha, H^2(\pi_2\rho)\,\beta)$; and *iii*) if $f : \rho \to \rho'$ and $R_i : \mathsf{Rel}(A_i, B_i)$, $i = 1, \ldots, k$, then $(x, y) \in H^*\rho\,R$ implies $(H^1(\pi_1 f)_A\,x, H^2(\pi_2 f)_B\,y) \in H^*\rho'\,R$.

Computation of $\omega$-directed colimits in $\mathsf{RT}_k$ extends componentwise to colimits in $\mathsf{RelEnv}$. Similarly, $\omega$-cocontinuity for endofunctors on $\mathsf{RT}_k$ extends to functors from $\mathsf{RelEnv}$ to $\mathsf{RT}_k$. Our relational interpretation $\llbracket \cdot \rrbracket^{\mathsf{Rel}} : \mathbb{F} \to [\mathsf{RelEnv}, \mathsf{Rel}]$ is given in Figure 3. It ensures that $\llbracket \Gamma \vdash F \to G \rrbracket^{\mathsf{Rel}}$ and $\llbracket \Gamma \vdash \forall\alpha.F \rrbracket^{\mathsf{Rel}}$ are as expected. As for set interpretations, $\llbracket \Gamma; \emptyset \vdash \mathsf{Nat}^{\alpha} F\, G \rrbracket^{\mathsf{Rel}}$ is $\omega$-cocontinuous because it is constant on $\omega$-directed sets. If $\rho \in \mathsf{RelEnv}$ we abbreviate $\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\rho$ as $\llbracket F \rrbracket^{\mathsf{Rel}}\rho$. For the last clause in Figure 3 to be well-defined we need $T_{H,\rho}$ to be an $\omega$-cocontinuous endofunctor on $\mathsf{RT}$, so that it admits a fixpoint. Since $T_{H,\rho}$ is defined in terms of $\llbracket \Gamma; \phi^k, \alpha \vdash H \rrbracket^{\mathsf{Rel}}$, this means that relational interpretations of types must be $\omega$-cocontinuous functors from $\mathsf{RelEnv}$ to $\mathsf{RT}_0$, which in turn entails that the actions of relational interpretations of types on objects and on morphisms in $\mathsf{RelEnv}$ are intertwined. We know from [17] that, for every $\Gamma; \alpha \vdash F$, $\llbracket \Gamma; \alpha \vdash F \rrbracket^{\mathsf{Rel}}\rho$ is actually in $[\mathsf{Rel}^k, \mathsf{Rel}]$, where $k = |\alpha|$. We first define the actions of each of these functors on morphisms between relation environments, and then argue that they are well-defined and have the required properties. To do this, we

$$\begin{aligned}
\llbracket \Gamma; \Phi \vdash \mathbf{0} \rrbracket^{\mathsf{Rel}}\rho &= 0\\
\llbracket \Gamma; \Phi \vdash \mathbf{1} \rrbracket^{\mathsf{Rel}}\rho &= 1\\
\llbracket \Gamma; \emptyset \vdash \mathsf{Nat}^{\alpha} F\, G \rrbracket^{\mathsf{Rel}}\rho &= \{\eta : \lambda R.\,\llbracket \Gamma; \alpha \vdash F \rrbracket^{\mathsf{Rel}}\rho[\alpha := R] \Rightarrow \lambda R.\,\llbracket \Gamma; \alpha \vdash G \rrbracket^{\mathsf{Rel}}\rho[\alpha := R]\}\\
&= \{(t, t') \in \llbracket \Gamma; \emptyset \vdash \mathsf{Nat}^{\alpha} F\, G \rrbracket^{\mathsf{Set}}(\pi_1\rho) \times \llbracket \Gamma; \emptyset \vdash \mathsf{Nat}^{\alpha} F\, G \rrbracket^{\mathsf{Set}}(\pi_2\rho) \mid\\
&\qquad \forall R_1 : \mathsf{Rel}(A_1, B_1), \ldots, R_k : \mathsf{Rel}(A_k, B_k).\;
(t_A, t'_B) \in (\llbracket \Gamma; \alpha \vdash G \rrbracket^{\mathsf{Rel}}\rho[\alpha := R])^{\llbracket \Gamma; \alpha \vdash F \rrbracket^{\mathsf{Rel}}\rho[\alpha := R]}\}\\
\llbracket \Gamma; \Phi \vdash \phi\, F \rrbracket^{\mathsf{Rel}}\rho &= (\rho\phi)\,\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\rho\\
\llbracket \Gamma; \Phi \vdash F + G \rrbracket^{\mathsf{Rel}}\rho &= \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\rho + \llbracket \Gamma; \Phi \vdash G \rrbracket^{\mathsf{Rel}}\rho\\
\llbracket \Gamma; \Phi \vdash F \times G \rrbracket^{\mathsf{Rel}}\rho &= \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\rho \times \llbracket \Gamma; \Phi \vdash G \rrbracket^{\mathsf{Rel}}\rho\\
\llbracket \Gamma; \Phi \vdash (\mu\phi.\lambda\alpha.H)\,G \rrbracket^{\mathsf{Rel}}\rho &= (\mu T_{H,\rho})\,\llbracket \Gamma; \Phi \vdash G \rrbracket^{\mathsf{Rel}}\rho\\
&\quad\text{where } T_{H,\rho} = (T^{\mathsf{Set}}_{H,\pi_1\rho}, T^{\mathsf{Set}}_{H,\pi_2\rho}, T^{\mathsf{Rel}}_{H,\rho}),\\
&\quad T^{\mathsf{Rel}}_{H,\rho}\, F = \lambda R.\,\llbracket \Gamma; \phi, \alpha \vdash H \rrbracket^{\mathsf{Rel}}\rho[\phi := F][\alpha := R], \text{ and}\\
&\quad T^{\mathsf{Rel}}_{H,\rho}\, \delta = \lambda R.\,\llbracket \Gamma; \phi, \alpha \vdash H \rrbracket^{\mathsf{Rel}}\mathit{id}_{\rho}[\phi := \delta][\alpha := \mathit{id}_R]
\end{aligned}$$

**Fig. 3.** Relational interpretation

extend $T_H$ to a *functor* from $\mathsf{RelEnv}$ to $[[\mathsf{Rel}^k, \mathsf{Rel}], [\mathsf{Rel}^k, \mathsf{Rel}]]$. Its action on an object $\rho \in \mathsf{RelEnv}$ is given by the higher-order functor $T_{H,\rho}$ whose actions on objects and morphisms are given in Figure 3. Its action on a morphism $f : \rho \to \rho'$ is the higher-order natural transformation $T_{H,f} : T_{H,\rho} \to T_{H,\rho'}$ whose action on any $F : [\mathsf{Rel}^k, \mathsf{Rel}]$ is the natural transformation $T_{H,f}\, F : T_{H,\rho}\, F \to T_{H,\rho'}\, F$ whose component at $R$ is $(T_{H,f}\, F)_R = \llbracket \Gamma; \phi, \alpha \vdash H \rrbracket^{\mathsf{Rel}} f[\phi := \mathit{id}_F][\alpha := \mathit{id}_R]$.

Using $T_H$, we can define the functorial action of relational interpretation. The action $\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}} f$ of $\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}$ on $f : \rho \to \rho'$ in $\mathsf{RelEnv}$ is given as in Definition 2, except that all interpretations are relational interpretations and all occurrences of $T^{\mathsf{Set}}_{H,f}$ are replaced by $T_{H,f}$. For this definition and Figure 3 to be well-defined we need that, for every $H$, $T_{H,\rho}\, F$ is a relation transformer, and $T_{H,f}\, F : T_{H,\rho}\, F \to T_{H,\rho'}\, F$ is a morphism of relation transformers, whenever $F$ is a relation transformer and $f : \rho \to \rho'$ is in $\mathsf{RelEnv}$. This is immediate from

$$\llbracket \Gamma;\Phi\vdash F\rrbracket = \left(\llbracket \Gamma;\Phi\vdash F\rrbracket^{\mathsf{Set}}, \llbracket \Gamma;\Phi\vdash F\rrbracket^{\mathsf{Set}}, \llbracket \Gamma;\Phi\vdash F\rrbracket^{\mathsf{Rel}}\right) \in [\mathsf{RelEnv}, \mathsf{RT}_0] \quad(1)$$

The proof is a straightforward induction on the structure of $F$, using an appropriate result from [17] to deduce $\omega$-cocontinuity of $\llbracket \Gamma; \Phi \vdash F \rrbracket$ in each case.

We can prove by simultaneous induction that set and relational interpretations of types respect demotion of functorial variables to non-functorial ones and that, for $D \in \{\mathsf{Set}, \mathsf{Rel}\}$, $\llbracket \Gamma; \Phi \vdash G[\alpha := K] \rrbracket^{D}\rho = \llbracket \Gamma; \Phi, \alpha \vdash G \rrbracket^{D}\rho[\alpha := \llbracket \Gamma; \Phi \vdash K \rrbracket^{D}\rho]$, and $\llbracket \Gamma; \Phi \vdash G[\alpha := K] \rrbracket^{D}f = \llbracket \Gamma; \Phi, \alpha \vdash G \rrbracket^{D}f[\alpha := \llbracket \Gamma; \Phi \vdash K \rrbracket^{D}f]$, and $\llbracket \Gamma; \Phi \vdash F[\phi := H] \rrbracket^{D}\rho = \llbracket \Gamma; \Phi, \phi \vdash F \rrbracket^{D}\rho[\phi := \lambda A.\,\llbracket \Gamma; \Phi, \alpha \vdash H \rrbracket^{D}\rho[\alpha := A]]$, and, finally, $\llbracket \Gamma; \Phi \vdash F[\phi := H] \rrbracket^{D}f = \llbracket \Gamma; \Phi, \phi \vdash F \rrbracket^{D}f[\phi := \lambda A.\,\llbracket \Gamma; \Phi, \alpha \vdash H \rrbracket^{D}f[\alpha := \mathit{id}_A]]$.

#### **4 The Identity Extension Lemma**

In most treatments of parametricity, equality relations are taken as *given*, either directly as diagonal relations or perhaps via reflexive graphs. By contrast, we give a categorical definition of graph relations for natural transformations and *construct* equality relations as particular such relations. Our definitions specialize to the usual ones for morphisms between sets and equality relations on sets.

The standard definition $(x, y) \in \langle f\rangle$ iff $fx = y$ of the graph $\langle f\rangle$ of a morphism $f : A \to B$ in $\mathsf{Set}$ naturally generalizes to associate to each natural transformation between $k$-ary functors on $\mathsf{Set}$ a $k$-ary relation transformer. Indeed, if $F, G : \mathsf{Set}^k \to \mathsf{Set}$ and $\alpha : F \to G$ is a natural transformation, then the functor $\alpha^* : \mathsf{Rel}^k \to \mathsf{Rel}$ is defined as follows. Given $R_1 : \mathsf{Rel}(A_1, B_1), \ldots, R_k : \mathsf{Rel}(A_k, B_k)$, let $\iota_{R_i} : R_i \hookrightarrow A_i \times B_i$, for $i = 1, \ldots, k$, be the inclusion of $R_i$ as a subset of $A_i \times B_i$, let $h_{A \times B}$ be the unique morphism making the left diagram below commute, and let $h_R : FR \to FA \times GB$ be $h_{A \times B} \circ F\iota_R$. Further, let $\alpha^{\wedge} R$ be the subobject through which $h_R$ is factorized by the mono-epi factorization system in $\mathsf{Set}$, as in the right diagram below. Then $\alpha^{\wedge} R : \mathsf{Rel}(FA, GB)$ by construction, so the action of $\alpha^*$ on objects can be given by $\alpha^*(A, B, R) = (FA, GB, \iota_{\alpha^{\wedge} R}\, \alpha^{\wedge} R)$. Its action on morphisms is given by $\alpha^*(\beta, \beta') = (F\beta, G\beta')$.

**Lemma 1.** *If* $F, G : [\mathsf{Set}^k, \mathsf{Set}]$*, and if* $\alpha : F \to G$ *is a natural transformation, then the* graph relation transformer for $\alpha$, *defined by* $\langle\alpha\rangle = (F, G, \alpha^*)$*, is in* $\mathsf{RT}_k$*.*

The action of a graph relation transformer on a graph relation can be computed explicitly: if $\alpha : F \to G$ is a morphism in $[\mathsf{Set}^k, \mathsf{Set}]$ and $f_1 : A_1 \to B_1, \ldots, f_k : A_k \to B_k$, then $\alpha^*\langle f\rangle = \langle Gf \circ \alpha_A\rangle = \langle\alpha_B \circ Ff\rangle$.
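As a concrete illustration of this computation (our example, not part of the paper's development): take $k = 1$, $F = G = \mathsf{List}$, and $\alpha = \mathit{reverse}$. For any $f : A \to B$, the equation above instantiates to

$$\mathit{reverse}^*\langle f\rangle \;=\; \langle \mathit{map}\, f \circ \mathit{reverse}_A\rangle \;=\; \langle \mathit{reverse}_B \circ \mathit{map}\, f\rangle,$$

which is just the naturality square of $\mathit{reverse}$ read as an equality of graph relations on lists.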

To prove the IEL we also need to know that equality relation transformers preserve equality relations. The *equality relation transformer* on $F : [\mathsf{Set}^k, \mathsf{Set}]$ is $\mathsf{Eq}_F = \langle\mathit{id}_F\rangle = (F, F, \mathit{id}_F^{\,*})$. The above definition then gives that, for all $A : \mathsf{Set}$, $\mathsf{Eq}_F^{\,*}\, \mathsf{Eq}_A = \mathit{id}_F^{\,*}\,\langle\mathit{id}_A\rangle = \langle F\mathit{id}_A \circ (\mathit{id}_F)_A\rangle = \langle\mathit{id}_{FA} \circ \mathit{id}_{FA}\rangle = \langle\mathit{id}_{FA}\rangle = \mathsf{Eq}_{FA}$. In addition, if $\rho, \rho' \in \mathsf{SetEnv}$ and $f : \rho \to \rho'$, then the *graph relation environment* $\langle f\rangle$ is defined pointwise by $\langle f\rangle\phi = \langle f\phi\rangle$ for every $\phi$. This entails that $\pi_1\langle f\rangle = \rho$ and $\pi_2\langle f\rangle = \rho'$. The *equality relation environment* $\mathsf{Eq}_\rho$ is defined to be $\langle\mathit{id}_\rho\rangle$. Our IEL is thus:

**Theorem 1 (IEL).** *If* $\rho \in \mathsf{SetEnv}$*, then* $\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\, \mathsf{Eq}_\rho = \mathsf{Eq}_{\llbracket \Gamma;\Phi \vdash F \rrbracket^{\mathsf{Set}}\rho}$*.*

The IEL's highly non-trivial proof is by induction on the structure of $F$. Only the Nat, application, and fixpoint cases are non-routine. The latter two explicitly calculate actions of graph relation transformers as above. The fixpoint case also uses that, for every $n \in \mathbb{N}$, the following intermediate results can be proved by simultaneous induction with Theorem 1: for any $H$, $\rho$, $A$, and subformula $J$ of $H$, both $T^n_{H,\mathsf{Eq}_\rho} K_0\, \mathsf{Eq}_A = (\mathsf{Eq}_{(T^{\mathsf{Set}}_{H,\rho})^n K_0})^*\, \mathsf{Eq}_A$ and $\llbracket \Gamma; \Phi, \phi, \alpha \vdash J \rrbracket^{\mathsf{Rel}}\, \mathsf{Eq}_\rho[\phi := T^n_{H,\mathsf{Eq}_\rho} K_0][\alpha := \mathsf{Eq}_A] = \llbracket \Gamma; \Phi, \phi, \alpha \vdash J \rrbracket^{\mathsf{Rel}}\, \mathsf{Eq}_\rho[\phi := \mathsf{Eq}_{(T^{\mathsf{Set}}_{H,\rho})^n K_0}][\alpha := \mathsf{Eq}_A]$ hold.

**Fig. 4.** Term semantics (the set and relational interpretations of term judgments: clauses for variables, $L$-abstraction and its application, $\perp$, unit, pairing and projections, case analysis and injections, and the $\mathsf{map}$, $\mathsf{in}$, and $\mathsf{fold}$ combinators)

The case of the proof when F and J are both μ-types makes clear that if functorial variables of arity greater than 0 were allowed to appear in the bodies of μ-types, then the IEL would fail.

With the IEL in hand we can prove a Graph Lemma for our setting:

**Lemma 2.** *If* $\rho, \rho' \in \mathsf{SetEnv}$ *and* $f : \rho \to \rho'$*, then* $\llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\, \langle f\rangle = \langle \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Set}} f\rangle$*.*


#### **5 Interpreting Terms**

If $\Delta = x_1 : F_1, \ldots, x_n : F_n$ is a term context for $\Gamma$ and $\Phi$, define $\llbracket \Gamma; \Phi \vdash \Delta \rrbracket^{D} = \llbracket \Gamma; \Phi \vdash F_1 \rrbracket^{D} \times \ldots \times \llbracket \Gamma; \Phi \vdash F_n \rrbracket^{D}$, where $D$ is $\mathsf{Set}$ or $\mathsf{Rel}$ as appropriate. Then every well-formed term has a set (resp., relational) interpretation as a natural transformation from the set (resp., relational) interpretation of its term context to that of its type. These interpretations, given in Figure 4, respect weakening, so that $\llbracket \Gamma; \Phi \mid \Delta, x : F \vdash t : G \rrbracket^{D}\rho = (\llbracket \Gamma; \Phi \mid \Delta \vdash t : G \rrbracket^{D}\rho) \circ \pi_\Delta$, where $\rho \in \mathsf{SetEnv}$ or $\rho \in \mathsf{RelEnv}$, and $\pi_\Delta$ is the projection $\llbracket \Gamma; \Phi \vdash \Delta, x : F \rrbracket^{D} \to \llbracket \Gamma; \Phi \vdash \Delta \rrbracket^{D}$.

The return type for the semantic $\mathsf{fold}$ is $\llbracket \Gamma; \beta \vdash F \rrbracket^{D}\rho[\beta := B]$. This interpretation gives $\llbracket \Gamma; \emptyset \mid \Delta \vdash \lambda x.t : F \to G \rrbracket^{D}\rho = \mathit{curry}(\llbracket \Gamma; \emptyset \mid \Delta, x : F \vdash t : G \rrbracket^{D}\rho)$ and $\llbracket \Gamma; \emptyset \mid \Delta \vdash s\,t : G \rrbracket^{D}\rho = \mathit{eval} \circ \langle \llbracket \Gamma; \emptyset \mid \Delta \vdash s : F \to G \rrbracket^{D}\rho, \llbracket \Gamma; \emptyset \mid \Delta \vdash t : F \rrbracket^{D}\rho\rangle$, so it specializes to the standard interpretations for System F terms. If $t$ is closed, i.e., if $\emptyset; \emptyset \mid \emptyset \vdash t : F$, then we write $\llbracket t : F \rrbracket^{D}$ instead of $\llbracket \emptyset; \emptyset \mid \emptyset \vdash t : F \rrbracket^{D}$. In addition, term interpretation respects substitution for both functorial and non-functorial type variables, as well as term substitution. Direct calculation reveals that interpretations of terms also satisfy $\llbracket \Gamma; \Phi \mid \Delta \vdash (L_{\alpha}x.t)_{K}\,s \rrbracket^{D} = \llbracket \Gamma; \Phi \mid \Delta \vdash t[\alpha := K][x := s] \rrbracket^{D}$. Term extensionality for both types and terms, i.e., $\llbracket \Gamma; \Phi \vdash (L_{\alpha}x.t)_{\alpha} : F \rrbracket^{D} = \llbracket \Gamma; \Phi \vdash t : F \rrbracket^{D}$ and $\llbracket \Gamma; \Phi \vdash (L_{\alpha}x.t)_{\alpha}\,x : F \rrbracket^{D} = \llbracket \Gamma; \Phi \vdash t : F \rrbracket^{D}$, follows (when both sides of these equations are defined).
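For intuition, here is a hedged Haskell transcription (ours, not part of the paper's formal development) of the λ-abstraction and application clauses read in $\mathsf{Set}$, where `env` stands for the interpretation of the term context $\Delta$:

```haskell
-- Interpretation of λ-abstraction and application in Set, read in Haskell.
-- `env` plays the role of the interpretation of the term context Δ.

denoteLam :: ((env, a) -> b) -> env -> (a -> b)
denoteLam t = curry t                 -- ⟦λx.t⟧ = curry ⟦t⟧

denoteApp :: (env -> (a -> b)) -> (env -> a) -> env -> b
denoteApp s t = \d -> s d (t d)       -- ⟦s t⟧ = eval ∘ ⟨⟦s⟧, ⟦t⟧⟩
```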

### **6 Free Theorems for Nested Types**

#### **6.1 Consequences of Naturality**

Define, for $\Gamma; \alpha \vdash F$, the term $\mathit{id}_F$ to be $\Gamma; \emptyset \mid \emptyset \vdash L_{\alpha}x.x : \mathsf{Nat}^{\alpha}F\, F$ and, for terms $\Gamma; \emptyset \mid \Delta \vdash t : \mathsf{Nat}^{\alpha}F\, G$ and $\Gamma; \emptyset \mid \Delta \vdash s : \mathsf{Nat}^{\alpha}G\, H$, the *composition* $s \circ t$ of $t$ and $s$ to be $\Gamma; \emptyset \mid \Delta \vdash L_{\alpha}x.s_{\alpha}(t_{\alpha}x) : \mathsf{Nat}^{\alpha}F\, H$. Then $\llbracket \Gamma; \emptyset \mid \emptyset \vdash \mathit{id}_F : \mathsf{Nat}^{\alpha}F\, F \rrbracket^{\mathsf{Set}}\rho\,* = \mathit{id}_{\lambda A.\llbracket \Gamma;\alpha \vdash F\rrbracket^{\mathsf{Set}}\rho[\alpha := A]}$ for any set environment $\rho$, and $\llbracket \Gamma; \emptyset \mid \Delta \vdash s \circ t : \mathsf{Nat}^{\alpha}F\, H \rrbracket^{\mathsf{Set}} = \llbracket \Gamma; \emptyset \mid \Delta \vdash s : \mathsf{Nat}^{\alpha}G\, H \rrbracket^{\mathsf{Set}} \circ \llbracket \Gamma; \emptyset \mid \Delta \vdash t : \mathsf{Nat}^{\alpha}F\, G \rrbracket^{\mathsf{Set}}$. Also, terms of $\mathsf{Nat}$ type behave as natural transformations with respect to their source and target types:

**Theorem 2.** *If* $\Gamma; \emptyset \mid \Delta \vdash s : \mathsf{Nat}^{\alpha,\gamma}F\, G$ *and* $\Gamma; \emptyset \mid \Delta \vdash t : \mathsf{Nat}^{\gamma}K\, H$*, then*
$$\llbracket \Gamma; \emptyset \mid \Delta \vdash ((\mathsf{map}^{K,H}_{G})_{\emptyset}\, t) \circ (L_{\gamma}z.s_{K,\gamma}z) : \mathsf{Nat}^{\gamma}F[\alpha := K]\; G[\alpha := H] \rrbracket^{\mathsf{Set}} = \llbracket \Gamma; \emptyset \mid \Delta \vdash (L_{\gamma}z.s_{H,\gamma}z) \circ ((\mathsf{map}^{K,H}_{F})_{\emptyset}\, t) : \mathsf{Nat}^{\gamma}F[\alpha := K]\; G[\alpha := H] \rrbracket^{\mathsf{Set}}$$
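To convey the flavour of such naturality consequences in more familiar terms, here is a hedged Haskell analogue (ordinary Haskell types, not the paper's calculus): any parametric function between two functors commutes with their map functions.

```haskell
import Data.Maybe (listToMaybe)

-- A parametric term whose type plays the role of a Nat type: [a] -> Maybe a.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

-- Its free theorem by naturality: fmap h . safeHead = safeHead . map h.
naturalityHolds :: Bool
naturalityHolds =
  fmap (+ 1) (safeHead [1, 2, 3 :: Int]) == safeHead (map (+ 1) [1, 2, 3])
```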

Theorem 2 gives rise to an entire family of free theorems that are consequences of naturality, and thus do not require the full power of parametricity. In particular, we can prove that the interpretation of every $\mathsf{map}_H$ is a functor, and that $\mathsf{map}$ is itself a higher-order functor. For example, the former property can be stated as: if $\Gamma; \alpha, \gamma \vdash H$, $\Gamma; \emptyset \mid \Delta \vdash g : \mathsf{Nat}^{\gamma}F\, G$, and $\Gamma; \emptyset \mid \Delta \vdash f : \mathsf{Nat}^{\gamma}G\, K$, then

$$\begin{split} & \llbracket \Gamma ; \emptyset \mid \Delta \vdash (\mathsf{map}_{H}^{\overline{F},\overline{K}})_{\emptyset}\, \overline{(f \circ g)} : \mathsf{Nat}^{\overline{\gamma}} H[\overline{\alpha := F}] \, H[\overline{\alpha := K}] \rrbracket^{\mathsf{Set}} \\ &= \llbracket \Gamma ; \emptyset \mid \Delta \vdash (\mathsf{map}_{H}^{\overline{G},\overline{K}})_{\emptyset}\, \overline{f} \circ (\mathsf{map}_{H}^{\overline{F},\overline{G}})_{\emptyset}\, \overline{g} : \mathsf{Nat}^{\overline{\gamma}} H[\overline{\alpha := F}] \, H[\overline{\alpha := K}] \rrbracket^{\mathsf{Set}}. \end{split}$$

We can also prove the expected properties of map, in, and fold and their interpretations, e.g., uniqueness and the universal property of the interpretation of fold, and that the interpretation of in is an isomorphism.
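As a hedged Haskell illustration of the functor law displayed above (our example nested type, not one from the paper): the map function of a truly nested type is defined by polymorphic recursion, and mapping a composition coincides with composing the maps.

```haskell
-- A nested type of perfect binary trees: the recursive occurrence is at PTree (a, a).
data PTree a = PLeaf a | PNode (PTree (a, a)) deriving (Eq, Show)

mapPTree :: (a -> b) -> PTree a -> PTree b
mapPTree f (PLeaf a) = PLeaf (f a)
mapPTree f (PNode t) = PNode (mapPTree (\(x, y) -> (f x, f y)) t)

-- Functoriality, the analogue of map_H (f ∘ g) = map_H f ∘ map_H g:
functorLawHolds :: Bool
functorLawHolds =
  mapPTree ((+ 1) . (* 2)) t == mapPTree (+ 1) (mapPTree (* 2) t)
  where t = PNode (PNode (PLeaf ((1, 2), (3, 4 :: Int))))
```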

#### **6.2 The Abstraction Theorem**

To get consequences of parametricity that are not merely consequences of naturality, we prove an Abstraction Theorem (Theorem 4). As usual for such theorems, we prove a more general result (Theorem 3) for open terms, and recover our Abstraction Theorem as its special case for closed terms of closed type.

**Theorem 3.** *Every well-formed term* $\Gamma; \Phi \mid \Delta \vdash t : F$ *induces a natural transformation from* $\llbracket \Gamma; \Phi \vdash \Delta \rrbracket$ *to* $\llbracket \Gamma; \Phi \vdash F \rrbracket$*, i.e., a triple of natural transformations* $(\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}, \llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}, \llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Rel}})$*, where, for* $D \in \{\mathsf{Set}, \mathsf{Rel}\}$*, and for* $\rho \in \mathsf{SetEnv}$ *or* $\rho \in \mathsf{RelEnv}$ *as appropriate,* $\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{D} : \llbracket \Gamma; \Phi \vdash \Delta \rrbracket^{D} \to \llbracket \Gamma; \Phi \vdash F \rrbracket^{D}$ *has component* $\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{D}\rho : \llbracket \Gamma; \Phi \vdash \Delta \rrbracket^{D}\rho \to \llbracket \Gamma; \Phi \vdash F \rrbracket^{D}\rho$ *at* $\rho$*. Moreover, for all* $\rho \in \mathsf{RelEnv}$*, we have* $\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Rel}}\rho = (\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}(\pi_1\rho), \llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}(\pi_2\rho))$*.*

The proof is by induction on t. It requires showing that set and relational interpretations of term judgments are natural transformations, and that all set interpretations of terms of Nat-types satisfy the appropriate equality preservation conditions from Figure 2. For the interesting cases of abstraction, application, map, in, and fold terms, propagating the naturality conditions is somewhat involved; the latter two especially require some delicate diagram chasing. That it is possible provides strong evidence that our development is sensible, natural, and at an appropriate level of abstraction.

Using Theorem 3 we can prove that our calculus admits no terms of the type $\mathsf{Nat}^{\alpha}\mathbf{1}\;\alpha$ of the polymorphic bottom, and that every closed term $g$ of type $\mathsf{Nat}^{\alpha}\alpha\;\alpha$ denotes the polymorphic identity function. Moreover, an immediate consequence of Theorem 3 is that if $\rho \in \mathsf{RelEnv}$ and $(a, b) \in \llbracket \Gamma; \Phi \vdash \Delta \rrbracket^{\mathsf{Rel}}\rho$, then $(\llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}(\pi_1\rho)\, a, \llbracket \Gamma; \Phi \mid \Delta \vdash t : F \rrbracket^{\mathsf{Set}}(\pi_2\rho)\, b) \in \llbracket \Gamma; \Phi \vdash F \rrbracket^{\mathsf{Rel}}\rho$. Its instantiation to closed terms of closed type gives

#### **Theorem 4 (Abstraction Theorem).** $(\llbracket t : F \rrbracket^{\mathsf{Set}}, \llbracket t : F \rrbracket^{\mathsf{Set}}) \in \llbracket F \rrbracket^{\mathsf{Rel}}$

Using Theorem 4 we can recover free theorems, such as the one for the type of the standard *filter* function for lists, that go beyond mere naturality, and extend them to those nested types for which analogous functions can be defined. In particular, we can extend short cut fusion for lists [10] to nested types, thereby formally proving correctness of the categorically inspired theorem from [16]. As shown there, replacing $\mathbf{1}$ with any type $\emptyset; \alpha \vdash C$ generalizes Theorem 5 to a free theorem whose conclusion is $\mathit{fold}_H\, B \circ G\, \mu H\, \mathit{in}_H = G\, \llbracket \emptyset; \alpha \vdash K \rrbracket^{\mathsf{Set}}\, B$.

**Theorem 5.** *If* $\emptyset; \phi, \alpha \vdash F$*,* $\emptyset; \alpha \vdash K$*,* $H : [\mathsf{Set}, \mathsf{Set}] \to [\mathsf{Set}, \mathsf{Set}]$ *is defined by* $Hfx = \llbracket \emptyset; \phi, \alpha \vdash F \rrbracket^{\mathsf{Set}}[\phi := f][\alpha := x]$*, and* $G = \llbracket \phi; \emptyset \mid \emptyset \vdash g : \mathsf{Nat}^{\emptyset}\, (\mathsf{Nat}^{\alpha}\, F\, (\phi\alpha))\, (\mathsf{Nat}^{\alpha}\, \mathbf{1}\, (\phi\alpha)) \rrbracket^{\mathsf{Set}}$ *for some* $g$*, then for every* $B \in H\llbracket \emptyset; \alpha \vdash K \rrbracket^{\mathsf{Set}} \to \llbracket \emptyset; \alpha \vdash K \rrbracket^{\mathsf{Set}}$ *we have* $\mathit{fold}_H\, B\, (G\, \mu H\, \mathit{in}_H) = G\, \llbracket \emptyset; \alpha \vdash K \rrbracket^{\mathsf{Set}}\, B$*.*
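For orientation, here is a hedged Haskell rendering of the list instance [10] that Theorem 5 generalises (ordinary lists, not the nested types of the theorem): a list produced by a rank-2 builder can be consumed by a fold without ever being materialised.

```haskell
{-# LANGUAGE RankNTypes #-}

-- g builds a list using only abstract "cons" and "nil" arguments.
build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- Short cut fusion, the list analogue of fold_H B (G (μH) in_H) = G ⟦K⟧ B:
--     foldr c n (build g) = g c n
fused :: Int
fused = (\g -> g (+) 0) (\cons nil -> cons 1 (cons 2 (cons 3 nil)))

unfused :: Int
unfused = foldr (+) 0 (build (\cons nil -> cons 1 (cons 2 (cons 3 nil))))
-- fused == unfused == 6, but `fused` never allocates the intermediate list
```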

### **7 Conclusion and Directions for Future Work**

We have constructed a parametric model for a calculus supporting primitive nested types, and used its Abstraction Theorem to derive free theorems for these types. This was not possible before [17] because these types were not previously known to have well-defined interpretations in locally finitely presentable categories (here, Set and Rel), and, to our knowledge, no term calculus for them existed either. We naturally hope (some appropriate variant of) the construction elaborated here will generalize to more advanced data types. For example, GADTs can be represented using left Kan extensions, and it was shown in [17] that adding a Lan construct to a calculus such as ours preserves the λ-cocontinuity needed for the data types it defines to have well-defined interpretations in locally λ-presentable categories. (Interestingly, λ > ℵ₁ is required to interpret even common GADTs.) This suggests carrying out our model construction in locally λ-presentable cartesian closed categories (lpcccs) C whose categories of (abstract) relations, obtained by pullback as in [13], are also lpcccs and are appropriately fibred over C. Adding term-level fixpoints further requires our semantic categories not just to be locally λ-presentable, but to support some kind of domain structure as well.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### **The Spirit of Node Replication**

Delia Kesner<sup>1,2</sup>, Loïc Peyrot<sup>1</sup>, and Daniel Ventura<sup>3</sup>

> <sup>1</sup> Université de Paris, CNRS, IRIF, Paris, France {kesner,lpeyrot}@irif.fr <sup>2</sup> Institut Universitaire de France, France <sup>3</sup> Univ. Federal de Goiás, Goiânia, Brazil ventura@ufg.br

**Abstract.** We define and study a term calculus implementing higher-order node replication. It is used to specify two different (weak) evaluation strategies: call-by-name and fully lazy call-by-need, which are shown to be observationally equivalent using type-theoretical tools.

#### **1 Introduction**

Computation in the λ-calculus is based on higher-order substitution, a complex operation being able to erase and copy terms during evaluation. Several formalisms have been proposed to model higher-order substitution, going from explicit substitutions (ES) [1] (see a survey in [41]) and labeled systems [15] to pointer graphs [60] or optimal sharing graphs [49]. The model of copying behind each of these formalisms is not the same.

Indeed, suppose one wants to substitute all the free occurrences of some variable x in a term t by some term u. We can imagine at least four ways to do that. (1) A drastic solution is a one-shot substitution, called *non-linear* (or *full*) *substitution*, based on simultaneously replacing *all* the free occurrences of x in t by the whole term u. This notion is generally defined by induction on the structure of the term t. (2) A refined method substitutes *one* free occurrence of x at a time, the so-called *linear* (or *partial*) *substitution*. This notion is generally defined by induction on the number of free occurrences of x in the term t. An orthogonal approach can be taken by replicating *one* term-constructor of u *at a time*, instead of replicating u as a whole, called here *node replication*. This notion can be defined by induction on the structure of the term u, and also admits two versions: (3) non-linear, *i.e.* by simultaneously replacing all the occurrences of x in t, or (4) linear. The linear version of the node replication approach can be formally defined by combining (2) and (3).
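To fix intuitions, here is a hedged Haskell sketch (ours, over plain λ-calculus syntax) of option (1), full non-linear substitution, defined by induction on the structure of the target term; options (2)-(4) differ only in how much of u is copied at each step.

```haskell
data Term = Var String | Lam String Term | App Term Term deriving Show

-- subst x u t  computes  t{x\u}: every free occurrence of x in t is replaced
-- by the whole term u in one shot.  For simplicity we assume the bound names
-- of t are distinct from x and from the free variables of u (α-convert first).
subst :: String -> Term -> Term -> Term
subst x u (Var y)     | y == x    = u
                      | otherwise = Var y
subst x u (App t1 t2)             = App (subst x u t1) (subst x u t2)
subst x u (Lam y b)   | y == x    = Lam y b                 -- x is shadowed
                      | otherwise = Lam y (subst x u b)
```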

It is not surprising that different notions of substitution give rise to different evaluation strategies. Indeed, linear substitution is the common model in well-known abstract machines for call-by-name and call-by-value (see *e.g.* [3]), while (linear) node replication is used to implement fully lazy sharing [60]. However, node replication, originally introduced to implement optimal graph reduction in

 Supported by CNPq grant Universal 430667/2016-7.

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 344–364, 2021.

https://doi.org/10.1007/978-3-030-71995-1 18

a graphical formalism, has only been studied from a Curry-Howard perspective by means of a term language known as the atomic λ-calculus [33].

**The Atomic Lambda-Calculus.** The Curry-Howard isomorphism uncovers a deep connection between logical systems and term calculi. It is then not surprising that different methods to implement substitution correspond to different ways to normalize logical proofs. Indeed, full substitution (1) can be explained in terms of natural deduction, while partial substitution (2) corresponds to cut elimination in Proof-Nets [2]. Replication of nodes (3)-(4) is based on a Curry-Howard interpretation of deep inference [32,33]. Indeed, the logical aspects of intuitionistic deep inference are captured by the atomic λ-calculus [33], where copying of terms proceeds *atomically*, *i.e.* node by node, similar to the optimal graph reduction of Lamping [49].

The atomic λ-calculus is based on *explicit control of resources* such as erasure and duplication. Its operational semantics explicitly handles the structural constructors of weakening and contraction, as in the calculus of resources λlxr [43,44]. As a result, understanding the meta-properties of the term calculus at a higher level, and applying it to concrete implementations of reduction strategies in programming languages, turn out to be quite difficult. In this paper, we take one step back, by studying the paradigm of *node replication* based on *implicit*, rather than *explicit*, weakening and contraction. This gives a new concise formulation of node replication which is simple enough to model different programming languages based on reduction strategies.

**Call-by-Name, Call-by-Value, Call-by-Need.** *Call-by-name* is used to implement programming languages in which arguments of functions are first copied, then evaluated. This is frequently expensive, and may be improved by *call-by-value*, in which arguments are evaluated first, then consumed. The difference can be illustrated by the term t = Δ(II), where Δ = λx.xx and I = λz.z: call-by-name first duplicates the argument II, so that its evaluation is also duplicated, while call-by-value first reduces II to (the value) I, so that duplications of the argument do not cause any duplicated evaluation. It is not always the best solution, though, because evaluating erasable arguments is useless.

*Call-by-need*, instead, takes the best of call-by-name and call-by-value: as in call-by-name, erasable arguments are not evaluated at all, and as in call-by-value, reduction of arguments occurs at most once. Furthermore, call-by-need implements a *demand-driven* evaluation, in which erasable arguments are never needed (so they are not evaluated), and non-erasable arguments are evaluated only if needed. Technically, some sharing mechanism is necessary, for example by extending the λ-calculus with explicit substitutions/let constructs [7]. Then β-reduction is decomposed into at least two steps: one creating an explicit (pending) substitution, and the other ones (linearly) substituting *values*. Thus for example, (λx.xx)(II) reduces to (xx)[x\II], and the substitution argument is then evaluated in order to find a value before performing the linear substitution.
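As a hedged reminder of what this sharing buys (a GHC illustration, since GHC implements call-by-need; it is not the paper's calculus): a let-bound argument is evaluated at most once, and not at all if it is never needed.

```haskell
import Debug.Trace (trace)

-- The trace message shows when the shared argument is actually evaluated.
sharedTwice :: Int
sharedTwice = let x = trace "argument evaluated" (21 + 21) in x + x
-- prints "argument evaluated" once and returns 84

neverNeeded :: Int
neverNeeded = let x = trace "argument evaluated" (21 + 21) in 0
-- prints nothing: the erasable argument is never evaluated
```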

Even when adopting this wise evaluation scheme, there are still some unnecessary copies of redexes: while only *values* (*i.e.* abstractions) are duplicated, they may contain redexes as subterms, *e.g.* λz.z(II) whose subterm II is a redex. Duplication of such values might cause redex duplications in *weak* (*i.e.* when evaluation is forbidden inside abstractions) call-by-need. This happens in particular in the *confluent* variant of weak reduction in [52].

**Full laziness.** Alas, it is not possible to keep all values shared forever, typically when they potentially contribute to the creation of a future β-reduction step. The key idea to gain in efficiency is then to keep the subterm II as a *shared* redex. Therefore, the (full) value λz.z(II) to be copied is split into two separate parts. The first one, called *skeleton*, contains the minimal information preserving the bound structure of the value, *i.e.* the linked structure between the binder and each of its (bound) variables. In our example, this is the term λz.zy, where y is a fresh variable. The second one is a multiset of *maximal free expressions* (MFE), representing all the shareable expressions (here only the term II). Only the skeleton is then copied, while the problematic redex II remains shared:

$$(\lambda x.xx)(\lambda z.z(\mathtt{II})) \to (xx)[x\backslash\lambda z.z(\mathtt{II})] \to ((\lambda z.zy)x)[x\backslash\lambda z.zy][y\backslash\mathtt{II}]$$

When the subterm II is needed later, it is first reduced inside the ES, as is usual in (standard) call-by-need, thus avoiding computing the redex twice. This optimization is called *fully lazy sharing* and is due to Wadsworth [60].
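The splitting of a value into its skeleton and its maximal free expressions is itself a simple syntactic computation. The following hedged Haskell sketch (ours, over plain λ-terms and ignoring the calculus's explicit cuts) extracts the skeleton of an abstraction by replacing each maximal subterm containing no bound variable with a fresh shared name:

```haskell
import Data.List (delete, union)

data Term = Var String | Lam String Term | App Term Term deriving Show

free :: Term -> [String]
free (Var x)   = [x]
free (Lam x t) = delete x (free t)
free (App t u) = free t `union` free u

-- skel bound fresh t: replace every maximal subterm of t containing none of
-- the binders in `bound` by a fresh variable, collecting the extracted MFEs.
skel :: [String] -> [String] -> Term -> (Term, [(String, Term)], [String])
skel bound (v:vs) t
  | not (any (`elem` free t) bound) = (Var v, [(v, t)], vs)
skel bound vs (App t u) =
  let (t', bs1, vs1) = skel bound vs  t
      (u', bs2, vs2) = skel bound vs1 u
  in (App t' u', bs1 ++ bs2, vs2)
skel bound vs (Lam x b) =
  let (b', bs, vs') = skel (x:bound) vs b in (Lam x b', bs, vs')
skel _ vs t = (t, [], vs)           -- a bound variable: part of the skeleton

-- Only the skeleton is copied; the extracted MFEs stay shared.
fullyLazySplit :: Term -> (Term, [(String, Term)])
fullyLazySplit (Lam x b) =
  let (b', bs, _) = skel [x] ["y" ++ show i | i <- [1 :: Int ..]] b
  in (Lam x b', bs)
fullyLazySplit t = (t, [])

identityT :: Term
identityT = Lam "x" (Var "x")

-- fullyLazySplit (λz.z(II)) = (λz.z y1, [("y1", II)]), as in the text.
example :: (Term, [(String, Term)])
example = fullyLazySplit (Lam "z" (App (Var "z") (App identityT identityT)))
```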

In the confluent weak setting mentioned earlier [52], the fully lazy optimization is even optimal in the sense of Lévy [51]. This means that the strategy reaches the weak normal form in the same number of β-steps as the shortest possible weak reduction sequence in the usual λ-calculus without sharing. Thus, fully lazy sharing turns out to be a *decidable* optimal strategy, in contrast to other weak evaluation strategies in the λ-calculus without sharing, which are also optimal but not decidable [11].

**Contributions.** The first contribution of this paper is a term calculus implementing (full) node replication and internally encoding skeleton extraction (Sec. 2). We study some of its main operational properties: termination of the substitution calculus, confluence, and its relation with the λ-calculus.

Our second contribution is the use of the node replication paradigm to give an alternative specification of two evaluation strategies usually described by means of full or linear substitution: call-by-name (Sec. 4.1) and weak fully lazy reduction (Sec. 4.2), based on the key notion of skeleton. The former can be related to (weak) head reduction, while the latter is a fully lazy version of (weak) call-by-need. In contrast to other implementations of fully lazy reduction relying on (external) meta-level definitions, our implementation is based on formal operations internally defined over the term syntax of the calculus.

Furthermore, while it is known that call-by-name and call-by-need specified by means of full/linear substitution are observationally equivalent [7], it was not clear at first whether the same property would hold in our case. Our third contribution is a proof of this result (Sec. 6) using semantical tools coming from proof theory –notably intersection types. This proof technique [42] considerably simplifies other approaches [7,54] based on syntactical tools. Moreover, the use of intersection types has another important consequence: standard call-by-name and call-by-need turn out to be observationally equivalent to call-by-name and call-by-need with node replication, as well as to the more semantical notion of neededness (see [45]).

Intersection types provide quantitative information about fully lazy evaluation so that a fourth contribution of this work is a measure based on type derivations which turns out to be an upper bound to the length of reduction sequences to normal forms in a fully lazy implementation.

More generally, our work bridges the gap between the Curry-Howard theoretical understanding of node replication and concrete implementations of fully lazy sharing. Related works are presented in the concluding Sec. 7.

#### **2 A Calculus for Node Replication**

We now present the syntax and operational semantics of the λR-calculus (R for Replication), as well as a notion of *level* playing a key role in the next sections.

**Syntax.** Given a countably infinite set X of variables x, y, z, ..., we consider the following grammars.

**(Terms)** t, u ::= x | λx.t | tu | t[x\u] | t[x\\λy.u]
**(Pure Terms)** p, q ::= x | λx.p | pq
**(Term Contexts)** C ::= ✷ | λx.C | Ct | tC | C[x\t] | C[x\\λy.u] | t[x\C] | t[x\\λy.C]
**(List Contexts)** L ::= ✷ | L[x\u] | L[x\\λy.u]

The set of terms (resp. **pure** terms) is denoted by Λ<sub>R</sub> (resp. Λ). We write |t| for the **size** of t, *i.e.* its number of constructors. We write I for the identity function λx.x. The construction [x\u] is an **explicit substitution (ES)**, and [x\\λy.u] an **explicit distributor**: the first is used to copy arbitrary terms, while the second is used specifically to duplicate abstractions. We write [x ◁ u] to denote an **explicit cut** in general, which is either [x\u] or [x\\u] when u is an abstraction λy.u′, typically to factorize definitions and proofs in which the two behave similarly. When using the general notation t[x ◁ u], we define x(◁) = 1 if the cut is an ES, and x(◁) = 0 otherwise.

We use two notions of **contexts**. Term contexts C extend those of the λcalculus to explicit cuts. List contexts L denote an arbitrary list of explicit cuts. They will be used to implement reduction *at a distance* in the operational semantics defined ahead.

**Free**/**bound variables** of terms are defined as usual, notably fv(t[x ◁ u]) := (fv(t)\{x}) ∪ fv(u). These notions are extended to contexts as expected, in particular fv(✷) := ∅. The **domain** of a **list context** is given by dlc(✷) := ∅ and dlc(L[x ◁ u]) := dlc(L) ∪ {x}. α-conversion [13] is extended to λR-terms as expected and is used to avoid capture of free variables. We write t{x\u} for the meta-level (capture-free) substitution simultaneously replacing all the free occurrences of the variable x in t by the term u.

The **application of a context** C **to a term** t, written C⟨t⟩, replaces the hole ✷ of C by t. For instance, ✷⟨t⟩ = t and (λx.✷)⟨t⟩ = λx.t. This operation is not defined modulo α-conversion, so that capture of variables may happen. Thus, we also consider another kind of application of contexts to terms, denoted with double brackets C⟨⟨t⟩⟩, which is only defined if there is no capture of variables. For instance, (λy.✷)⟨⟨x⟩⟩ = λy.x, while (λx.✷)⟨⟨x⟩⟩ is undefined.

**Operational semantics.** ES may block some expected *meaningful* (*i.e.* non-structural) reductions. For instance, β-reduction is blocked in (λx.t)[y\v]u because an ES lies between the function and its argument. This kind of stuck redex does not arise in graphical representations (*e.g.* [28]), but it is typical of the sequential structure of *term* syntaxes.

There are at least two ways to handle this issue. The first is based on *structural/permutation* rules, as in [33], where the substitution is first pushed outside the application node, as (λx.t)[y\v]u → ((λx.t)u)[y\v], so that β-reduction is finally unblocked. The second, less elementary, possibility is given by an operational semantics *at a distance* [6,4], where the β-rule can be fired by a rule like L⟨λx.t⟩u → L⟨t[x\u]⟩, L being an arbitrary list context. The distance paradigm is therefore used to gather meaningful and permutation rules into a single reduction step. In λR, we combine these two technical tools. First, we consider the following permutation rules, all of which are constrained by the condition x ∉ fv(t).

$$\begin{array}{ll}
\lambda x.u[y\lhd t] \mapsto_{\pi} (\lambda x.u)[y\lhd t] & \qquad v[x\lhd u]\,t \;\mapsto_{\pi}\; (v\,t)[x\lhd u] \\
t\,v[x\lhd u] \;\mapsto_{\pi}\; (t\,v)[x\lhd u] & \qquad t[y\lhd v[x\lhd u]] \mapsto_{\pi} t[y\lhd v][x\lhd u]
\end{array}$$

The reduction relation →π is defined as the closure of the rules ↦π under *all* contexts. It carries no computational content, only a structural one: it unblocks redexes by moving explicit cuts outwards.

In order to highlight the computational content of node replication we combine distance and permutations within the λR**-calculus**, given by the closure of the following rules by all the contexts.

$$\begin{array}{ll}
\mathsf{L}\langle\lambda x.t\rangle u & \mapsto_{\mathsf{dB}} \;\mathsf{L}\langle t[x\backslash u]\rangle\\
t[x\backslash\mathsf{L}\langle uv\rangle] & \mapsto_{\mathsf{app}} \;\mathsf{L}\langle t\{x\backslash yz\}[y\backslash u][z\backslash v]\rangle \quad\text{where } y \text{ and } z \text{ are fresh} \\
t[x\backslash\mathsf{L}\langle\lambda y.u\rangle] & \mapsto_{\mathsf{dist}} \;\mathsf{L}\langle t[x\backslash\backslash\lambda y.z[z\backslash u]]\rangle \quad\text{where } z \text{ is fresh} \\
t[x\backslash\backslash\lambda y.u] & \mapsto_{\mathsf{abs}} \;\mathsf{L}\langle t\{x\backslash\lambda y.p\}\rangle \quad\text{where } u \to_{\pi}^{*} \mathsf{L}\langle p\rangle \text{ and } y \notin \mathsf{fv}(\mathsf{L})\\
t[x\backslash\mathsf{L}\langle y\rangle] & \mapsto_{\mathsf{var}} \;\mathsf{L}\langle t\{x\backslash y\}\rangle
\end{array}$$

Notice in the five rules above that the (meta-level) substitution is *full* (it is performed simultaneously on all free occurrences of the variable x), and the list context L is always pushed outside the term t. We will highlight in green such list contexts in the forthcoming examples to improve readability. Apart from rule dB used to fire β-reductions, there are four substitution rules used to copy abstractions, applications and variables, pushing outside all the cuts surrounding the node to be copied. Rule app copies one application node, while rule var copies one variable node. The case of abstractions is more involved as explained below.

The specificity in copying an abstraction λy.u is due to the (binding) relation between λy and all the free occurrences of y in its body u. Abstractions are thus copied in two stages. The first is implemented by the rule dist, creating a distributor in which a potentially replaceable abstraction is placed, while moving its body inside a new ES. There are then two ways to replicate nodes of the body. Either they can be copied inside the distributor (where the binding relation between λy and the bound occurrences of y is kept intact), or they can be pushed outside the distributor, by means of the (non-deterministic) rule abs. In the second case, however, free occurrences of y cannot be pushed outside the abstraction (with binder y) to be duplicated, at the risk of breaking consistency: only shared components without y links can then be pushed outside. These components are gathered together into a list context L, which is pushed outside by using permutation rules, before performing the substitution of the pure body containing all the bound occurrences of y. Specifying this operation using only distance is hard, thus permutation rules are also used in our rule abs.

The s-substitution relation →s (resp. the distant Beta relation →dB) is defined as the closure of ↦app ∪ ↦dist ∪ ↦abs ∪ ↦var (resp. ↦dB) under *all* contexts, and the reduction relation →R is the union of →s and →dB.

*Example 1.* Let t₀ = (λx₁.x₁)(λy.Iy). In what follows, we underline the term where the reduction is performed:

t₀ →dB x₁[x₁\λy.Iy] →dist x₁[x₁\\λy.z[z\Iy]] →app x₁[x₁\\λy.(z₁z₂)[z₁\I][z₂\y]] →dist x₁[x₁\\λy.(z₁z₂)[z₁\\λx₃.z₃[z₃\x₃]][z₂\y]] →var x₁[x₁\\λy.(z₁y)[z₁\\λx₃.z₃[z₃\x₃]]] →abs (λy.z₁y)[z₁\\λx₃.z₃[z₃\x₃]]

Let R be any reduction relation. We write →*_R for the reflexive-transitive closure of →R. A term t is said to be R**-confluent** *iff* t →*_R u and t →*_R s implies there is t′ such that u →*_R t′ and s →*_R t′. The relation R is **confluent** *iff* every term is R-confluent. A term t is said to be in R**-normal form** (written also R-nf) *iff* there is no t′ such that t →R t′. A term t is said to be R**-terminating** or R**-normalizing** *iff* there is no infinite R-sequence starting at t. The reduction R is said to be **terminating** *iff* every term is R-terminating.

**Levels.** The notion of level plays a key role in this work. Intuitively, the level of a variable in a term indicates the maximal depth of its free occurrences w.r.t. ES (and not w.r.t. explicit distributors). However, in order to keep soundness w.r.t. the permutation rules, levels are computed along *linked chains* of ES. For instance, the level of w in both x[x\y[y\w]] and x[x\y][y\w] is 2. Formally, the **level** of a variable z in a term t is defined by (structural) induction, while assuming by α-conversion that z is not a bound variable in t:

$$\begin{aligned}
\mathtt{lv}_z(x) &:= 0 \qquad\quad \mathtt{lv}_z(t_1 t_2) := \max(\mathtt{lv}_z(t_1), \mathtt{lv}_z(t_2)) \qquad\quad \mathtt{lv}_z(\lambda y.t) := \mathtt{lv}_z(t) \\
\mathtt{lv}_z(t[x \lhd u]) &:= \begin{cases} \mathtt{lv}_z(t) & \text{if } z \notin \mathtt{fv}(u) \\ \max(\mathtt{lv}_z(t),\, \mathtt{lv}_x(t) + \mathtt{lv}_z(u) + x(\lhd)) & \text{otherwise} \end{cases}
\end{aligned}$$

Notice that lv_w(t) = 0 whenever w ∉ fv(t) or t is pure. We illustrate the concept of level with an example. Consider t = x[x\z[y\w]][w\w′]; then lv_z(t) = 1, lv_{w′}(t) = 3, and lv_y(t) = 0 because y ∉ fv(t). This notion is also extended to contexts as expected, *i.e.* lv_✷(C) = lv_z(C⟨z⟩), where z is a fresh variable.
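Since the level function is a simple structural recursion, it can be transcribed directly; the following hedged Haskell sketch (ours, with a naive term representation) computes lv and reproduces the values of the example above.

```haskell
import Data.List (delete, union)

data Cut  = ES | Dist deriving (Eq, Show)
data Term = Var String | Lam String Term | App Term Term
          | Sub Term String Cut Term     -- Sub t x k u ≈ t[x\u] (ES) or t[x\\u] (Dist)
          deriving Show

fv :: Term -> [String]
fv (Var x)       = [x]
fv (Lam x t)     = delete x (fv t)
fv (App t u)     = fv t `union` fv u
fv (Sub t x _ u) = delete x (fv t) `union` fv u

weight :: Cut -> Int                     -- the x(◁) summand: 1 for an ES, 0 for a distributor
weight ES   = 1
weight Dist = 0

-- lv z t, assuming (by α-conversion) that z is not bound in t.
lv :: String -> Term -> Int
lv _ (Var _)   = 0
lv z (Lam _ t) = lv z t
lv z (App t u) = max (lv z t) (lv z u)
lv z (Sub t x k u)
  | z `notElem` fv u = lv z t
  | otherwise        = max (lv z t) (lv x t + lv z u + weight k)

-- The example term t = x[x\z[y\w]][w\w'] from the text:
exampleTerm :: Term
exampleTerm =
  Sub (Sub (Var "x") "x" ES (Sub (Var "z") "y" ES (Var "w"))) "w" ES (Var "w'")
-- lv "z" exampleTerm == 1,  lv "w'" exampleTerm == 3,  lv "y" exampleTerm == 0
```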

**Lemma 2.** *Let* t₀ ∈ Λ<sub>R</sub>*. If* t₀ →π,s t₁*, then* lv_w(t₀) ≥ lv_w(t₁) *for any* w ∈ X*.*

It is worth noticing that there are two cases when the level of a variable in a term may decrease: using a permutation rule to push an explicit cut out of another cut when the first one is a void cut, or using rule →var.

Hence, levels alone are not enough to prove termination of →s. We then define a decreasing measure for →s in which not only variables but also constructors are indexed by a level. For instance, in t[x\λy.yz], we can consider that *all* the constructors of λy.yz have level lv_x(t). This ensures that the level of an abstraction decreases when applying rule dist, as does the level of an application when applying rule app. This is what we do next.

### **3 Operational Properties**

We now prove three key properties of the λR-calculus: termination of the reduction system →s, relation between λR and the λ-calculus, and confluence of the reduction system →<sup>λ</sup>R.

**Termination of** →s**.** Some (rather informal) arguments are provided in [33] to justify termination of the substitution subrelation of their whole calculus. We expand these ideas into an alternative full formal proof adapted to our case, which is based on a measure being strictly decreasing w.r.t. →s.

We consider a set <sup>O</sup> of objects of the form <sup>a</sup>(k, n) or <sup>b</sup>(k) (k, n <sup>∈</sup> <sup>N</sup>), which is equipped with the following ordering >O:

$$\begin{array}{ll}
\mathsf{a}(k, n) >_{O} \mathsf{a}(k', n') & \text{if } k > k' \text{, or } (k = k' \text{ and } n > n')\\
\mathsf{b}(k) >_{O} \mathsf{a}(k', n') & \text{if } k \geq k'\\
\mathsf{a}(k, n) >_{O} \mathsf{b}(k') & \text{if } k > k'\\
\mathsf{b}(k) >_{O} \mathsf{b}(k') & \text{if } k > k'
\end{array}$$

**Lemma 3.** *The order* ><sup>O</sup> *on the set* O *is well-founded.*

We write >ᴹᵁᴸ_O for the multiset extension of the order >_O on O, which turns out to be well-founded [8] by Lem. 3. We are now ready to define (inductively) our **cuts level** measure C(·) on terms, where the following operation on multisets is used: p · M := [a(p + k, n) | a(k, n) ∈ M] ⊔ [b(p + k) | b(k) ∈ M], where ⊔ denotes multiset union.

$$\begin{array}{l}
\mathsf{C}(x) := [\,] \qquad \mathsf{C}(\lambda x.t) := \mathsf{C}(t) \qquad \mathsf{C}(tu) := \mathsf{C}(t) \sqcup \mathsf{C}(u)\\
\mathsf{C}(t[x\backslash u]) := \mathsf{C}(t) \sqcup (\mathtt{lv}_x(t) + 1) \cdot \mathsf{C}(u) \sqcup [\mathsf{a}(\mathtt{lv}_x(t) + 1, |u|)]\\
\mathsf{C}(t[x\backslash\backslash u]) := \mathsf{C}(t) \sqcup \mathtt{lv}_x(t) \cdot \mathsf{C}(u) \sqcup [\mathsf{b}(\mathtt{lv}_x(t))]
\end{array}$$

Intuitively, the integer k in a(k, n) and b(k) counts the level of variables bound by explicit cuts, while n counts the size of terms to be substituted by an ES. Remark that for every pure term p we have C (p) = [ ]. Moreover:

**Lemma 4.** *Let* t₀ ∈ Λ<sub>R</sub>*. Then* t₀ →π t₁ *(resp.* t₀ →s t₁*) implies* C(t₀) ≥ᴹᵁᴸ_O C(t₁) *(resp.* C(t₀) >ᴹᵁᴸ_O C(t₁)*).*

As an example, consider the following reduction sequence:

$$\begin{array}{rl}
t_0 = (yy)[y \backslash (\lambda z.x)w] & \rightarrow_{\mathtt{app}} (y_1 y_2)(y_1 y_2)[y_1 \backslash \lambda z.x][y_2 \backslash w] = t_1 \rightarrow_{\mathtt{var}}\\
t_2 = (y_1 w)(y_1 w)[y_1 \backslash \lambda z.x] & \rightarrow_{\mathtt{dist}} (y_1 w)(y_1 w)[y_1 \backslash\backslash \lambda z.r[r \backslash x]] = t_3
\end{array}$$

We have C(t₀) = [a(1, 4)], C(t₁) = [a(1, 1), a(1, 2)], C(t₂) = [a(1, 2)], and C(t₃) = [a(1, 1), b(0)]. So C(tᵢ) >ᴹᵁᴸ_O C(tᵢ₊₁) for i = 0, 1, 2.

**Corollary 5.** *The reduction relation* →<sup>s</sup> *is terminating.*

**Simulations.** We show the relation between λR and the λ-calculus, as well as the atomic λ-calculus. For that, we introduce a projection from λR-terms to λ-terms implementing the unfolding of all the explicit cuts: x↓ := x, (λx.t)↓ := λx.t↓, (tu)↓ := t↓u↓, (t[x ◁ u])↓ := t↓{x\u↓}. Thus, *e.g.*, (x[x\z][y\w][w\w′])↓ = z.
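The unfolding projection is again a direct structural recursion; here is a hedged Haskell sketch (ours, ignoring variable capture, which can be avoided by α-conversion) that executes all explicit cuts by meta-level substitution.

```haskell
data Cut  = ES | Dist deriving Show
data Term = Var String | Lam String Term | App Term Term
          | Sub Term String Cut Term     -- Sub t x k u ≈ t[x\u] or t[x\\u]
          deriving Show

-- Naive full substitution t{x\u}; assumes bound names avoid x and fv(u).
subst :: String -> Term -> Term -> Term
subst x u (Var y)       = if y == x then u else Var y
subst x u (Lam y b)     = if y == x then Lam y b else Lam y (subst x u b)
subst x u (App t1 t2)   = App (subst x u t1) (subst x u t2)
subst x u (Sub t y k v) = Sub (if y == x then t else subst x u t) y k (subst x u v)

-- The unfolding (·)↓: x↓ = x, (λx.t)↓ = λx.t↓, (tu)↓ = t↓u↓, (t[x◁u])↓ = t↓{x\u↓}.
unfold :: Term -> Term
unfold (Var x)       = Var x
unfold (Lam x t)     = Lam x (unfold t)
unfold (App t u)     = App (unfold t) (unfold u)
unfold (Sub t x _ u) = subst x (unfold u) (unfold t)

-- Example from the text: (x[x\z][y\w][w\w'])↓ = z.
exampleUnfold :: Term
exampleUnfold = unfold (Sub (Sub (Sub (Var "x") "x" ES (Var "z")) "y" ES (Var "w"))
                            "w" ES (Var "w'"))   -- evaluates to Var "z"
```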

**Lemma 6.** *Let* t₀ ∈ Λ<sub>R</sub>*. If* t₀ →R t₁*, then* t₀↓ →*β t₁↓*. In particular, if either* t₀ →π t₁ *or* t₀ →s t₁*, then* t₀↓ = t₁↓*.*

The relation →s enjoys **full composition** on *pure* terms, namely, for any p ∈ Λ, t[x\p] →⁺s t{x\p}. This property does not hold in general. Indeed, if t = xx, then (xx)[x\z[z\w]] does not s-reduce to (z[z\w])(z[z\w]), but to (zz)[z\w]. However, full composition restricted to pure terms is sufficient to prove simulation of the λ-calculus.

**Lemma 7 (Simulation of the** λ**-calculus).** *Let* p₀ ∈ Λ*. If* p₀ →β p₁*, then* p₀ →dB →⁺s p₁*.*

The previous results have an important consequence relating the original atomic λ-calculus and the λR-calculus. Indeed, it can be shown that reduction in the atomic λ-calculus is captured by λR, and vice-versa. More precisely, the λR-calculus can be simulated into the atomic λ-calculus by Lem. 6 and [33], while the converse holds by [33] and Lem. 7.

A more structural correspondence between λR and the atomic λ-calculus could also be established. Indeed, λR can first be refined into a (non-linear) calculus *without* distance, say λR′, so that the permutation rules are integrated in the intermediate calculus as independent rules. Then a structural relation can be established between λR and λR′ on one side, and between λR′ and the atomic λ-calculus on the other (as done for example in [43] for the λ-calculus).

**Confluence.** By Cor. 5 the reduction relation →<sup>s</sup> is terminating. It is then not difficult to conclude confluence of →<sup>s</sup> by using the unfolding function <sup>↓</sup>. Therefore, by termination of →<sup>s</sup> any t ∈ Λ<sup>R</sup> has an s-nf, and by confluence this s-nf is unique (and computed by the unfolding function). Using the interpretation method [35] together with Lem. 6, Cor. 5, and Lem. 7, one obtains:

**Theorem 8.** *The reduction relation* →R *is confluent.*

### **4 Encoding Evaluation Strategies**

In the theory of programming languages [56], the notion of *calculus* is usually based on a non-deterministic rewriting relation, providing an equational system of calculation, while the deterministic notion of *strategy* is associated to a concrete machinery being able to implement a specific evaluation procedure. Typical evaluation strategies are call-by-name, call-by-value, call-by-need, etc.

Although the atomic λ-calculus was introduced as a technical tool to implement full laziness, only its (non-deterministic) equational theory was studied. In this paper we bridge the gap between the theoretical presentation of the atomic λ-calculus and concrete specifications of evaluation strategies. Indeed, we use the λR-calculus to investigate two concrete cases: a call-by-name strategy implementing weak head reduction, based on full substitution, and the call-by-need fully lazy strategy, which uses linear substitution.

In both cases, explicit cuts can in principle be placed anywhere in the distributors, thus requiring one to dive deep into such terms to deal with them. We then restrict the set of terms to a subset U, which simplifies formal reasoning about explicit cuts inside distributors. Indeed, distributors will all be of the shape [x\\λy.LL⟨p⟩], where p is a pure term (and LL is a *commutative list* defined below). We argue that this restriction is natural in a weak implementation of the λ-calculus: it holds for pure terms and is preserved through evaluation. We consider the following grammars.

$$\begin{array}{ll}
\text{(Linear Cut Values)} & \mathsf{T} ::= \lambda x.\mathsf{LL}\langle p\rangle \ \text{ where } y \in \mathtt{dlc}(\mathsf{LL}) \text{ implies } |p|_y = 1\\
\text{(Commutative Lists)} & \mathsf{LL} ::= \Box \mid \mathsf{LL}[x\backslash p] \mid \mathsf{LL}[x\backslash \mathsf{T}] \ \text{ where } |\mathsf{LL}|_x = 0\\
\text{(Values)} & v ::= \lambda x.p\\
\text{(Restricted Terms)} & \mathsf{U} ::= x \mid v \mid \mathsf{U}\,\mathsf{U} \mid \mathsf{U}[x\backslash \mathsf{U}] \mid \mathsf{U}[x\backslash\!\backslash \mathsf{T}]
\end{array}$$

A term t generated by any of the grammars G defined above is written t ∈ G. Thus *e.g.* λx.(yz)[y\I][z\I] ∈ T but λx.(yy)[y\I] ∉ T, ✷[x\yz][x'\I] ∈ LL but ✷[x\yz][y\I] ∉ LL, and (yz)[y\\I] ∈ U but (yz)[y\\λx.(yy)[y\I]] ∉ U.

The set T is stable under the relation →s, but U is clearly not stable under the whole →R relation, where dB-reductions may occur under abstractions. However, U is stable under both weak strategies to be defined: call-by-name and call-by-need. We factorize the proofs by proving stability for a more general relation →R−, defined as the relation →R with dB-reductions forbidden under abstractions and inside distributors.

#### **Lemma 9 (Stability of the Grammar by →s / →R−).**

*1. If* t ∈ T *and* t →s t′*, then* t′ ∈ T*. 2. If* t ∈ U *and* t →R− t′*, then* t′ ∈ U*.*

#### **4.1 Call-by-name**

The **call-by-name** (CBN) strategy →name (Fig. 1) is defined on the set of terms U as the union of the following relations →ndb and →ns. The strategy is *weak* as there is no reduction under abstractions. It is also worth noticing (as a particular case of Lem. 9) that t ∈ U and t →name t′ implies t′ ∈ U.

**Fig. 1.** Call-by-Name Strategy

*Example 10.* Let t<sup>0</sup> = (λx1.I(x1I))(λy.Iy). Then,

$$\begin{array}{l}
t_0 \rightarrow_{\mathtt{dB}} (\mathtt{I}(x_1\mathtt{I}))[x_1\backslash \lambda y.\mathtt{I}y] \rightarrow_{\mathtt{dist}} (\mathtt{I}(x_1\mathtt{I}))[x_1\backslash\!\backslash \lambda y.z[z\backslash \mathtt{I}y]] \rightarrow_{\mathtt{app}}\\
(\mathtt{I}(x_1\mathtt{I}))[x_1\backslash\!\backslash \lambda y.(z_1z_2)[z_1\backslash \mathtt{I}][z_2\backslash y]] \rightarrow_{\mathtt{var}} (\mathtt{I}(x_1\mathtt{I}))[x_1\backslash\!\backslash \lambda y.(z_1y)[z_1\backslash \mathtt{I}]] \rightarrow_{\mathtt{abs}}\\
(\mathtt{I}((\lambda y.z_1y)\mathtt{I}))[z_1\backslash \mathtt{I}] \rightarrow_{\mathtt{dB}} x_2[x_2\backslash (\lambda y.z_1y)\mathtt{I}][z_1\backslash \mathtt{I}]
\end{array}$$

Although the strategy →name is not deterministic, it enjoys the remarkable *diamond* property, guaranteeing in particular that all reduction sequences starting from t and ending in a normal form have the same length.

It is worth noticing that simulation lemmas also hold between call-by-name in the λ-calculus, known as weak head reduction and denoted by →whr, and the λR-calculus. Indeed, →whr is defined as the β-reduction rule closed under contexts E ::= ✷ | E t. Then, as a consequence of Lem. 7, we have that p0 →whr p1 implies p0 →∗R p1, and as a consequence of Lem. 6, we have that t0 →name t1 implies t0↓ →∗β t1↓. More importantly, call-by-name in the λ-calculus and call-by-name in the λR-calculus are also related. Indeed,

#### **Lemma 11 (Relating Call-by-Name Strategies).**

- *Let* p0 ∈ Λ*. If* p0 →whr p1*, then* p0 →+name p1*.*
- *Let* t0 ∈ U*. If* t0 →name t1*, then* t0↓ →∗whr t1↓*.*

#### **4.2 Call-by-need**

We now specify a deterministic strategy →flneed implementing demand-driven computations and only linearly replicating nodes of *values* (*i.e.* pure abstractions). Given a value λx.p, only the piece of structure containing the paths between the binder λx and all the free occurrences of x in p, called its *skeleton*, will be copied. All the other components of the abstraction will remain shared, thus avoiding some future duplications of redexes, as explained in the introduction. By copying only the smallest possible substructure of the abstraction, the strategy →flneed implements an optimization of call-by-need called *fully lazy sharing* [60]. First, we formally define the key notions we are going to use.

A **free expression** [39,9] of a *pure* term p is a strict subterm q of p such that every free occurrence of a variable in q is also a free occurrence of the variable in p. A **free expression** of p is **maximal** if it is not a subterm of another free expression of p. From now on, we will consider the multiset of all maximal free expressions (**MFE**) of a term. Thus *e.g.* the MFEs of λy.p, where p = (Iy)I(λz.zyw), are given by the multiset [I, I, w].

An n**-ary context** (n ≥ 0) is a term with n holes ✷. A skeleton is an n-ary pure context where the maximal free expressions w.r.t. a variable set θ are replaced with holes. Formally, the θ**-skeleton** {{p}}θ of a pure term p, where θ = {x1, ..., xn}, is the n-ary pure context {{p}}θ such that {{p}}θ⟨q1,...,qn⟩ = p, for [q1,...,qn] the maximal free expressions of λx1. ... λxn.p⁴. Thus, for the same p as before, λy.{{p}}{y} = λy.(✷y)✷(λz.zy✷).
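These definitions suggest a simple recursive procedure: a subterm whose free variables avoid θ (and every binder crossed so far) is replaced by a hole and collected as an MFE, otherwise one recurses. The following Haskell sketch is our own illustration of this reading on a toy syntax for pure terms (assuming θ ≠ ∅); it computes the skeleton and the MFEs and reproduces the example p = (I y) I (λz. z y w).

```haskell
import qualified Data.Set as S

type Name = String
data PTerm = V Name | L Name PTerm | A PTerm PTerm | Hole   -- Hole (the ✷ of skeletons)
  deriving Show

fv :: PTerm -> S.Set Name
fv (V x)   = S.singleton x
fv (L x t) = S.delete x (fv t)
fv (A t u) = fv t `S.union` fv u
fv Hole    = S.empty

-- split theta t returns the theta-skeleton of t together with the list of
-- maximal free expressions, in left-to-right order; theta is assumed non-empty.
split :: S.Set Name -> PTerm -> (PTerm, [PTerm])
split theta t
  | S.null (fv t `S.intersection` theta) = (Hole, [t])   -- a maximal free expression
  | otherwise = case t of
      V x   -> (V x, [])
      L x u -> let (s, ms) = split (S.insert x theta) u in (L x s, ms)
      A u w ->
        let (s1, m1) = split theta u
            (s2, m2) = split theta w
        in (A s1 s2, m1 ++ m2)
      Hole  -> (Hole, [])

-- Example from the text: the MFEs of \y.(I y) I (\z. z y w) are [I, I, w]
-- and the skeleton is (✷ y) ✷ (\z. z y ✷).
main :: IO ()
main = print (split (S.singleton "y") p)
  where i = L "x" (V "x")
        p = A (A (A i (V "y")) i) (L "z" (A (A (V "z") (V "y")) (V "w")))
```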

**The Splitting Operation.** Splitting a term into a skeleton and a multiset of MFEs is at the core of full laziness. This can naturally be implemented in the node replication model, as observed in [33]. Here, we define a (small-step) strategy →st on the set of terms T to achieve it (Fig. 2), which is indeed a subset of the reduction relation λR⁵. The relation →st makes use of four basic rules which are parameterized by the variable y upon which the skeleton is built, written →^y. There are also two contextual (inductive) rules.


*Example 12.* Let y, z ∉ fv(t), so that t is the MFE of λy.x[x\λz.(yt)z]. Then,

$$\begin{array}{l}
\lambda y.x[x\backslash \lambda z.(yt)z] \rightarrow^{y}_{\mathtt{dist}} \lambda y.x[x\backslash \lambda z.w[w\backslash (yt)z]] \rightarrow^{z}_{\mathtt{app}}\\
\lambda y.x[x\backslash \lambda z.(w_1w_2)[w_1\backslash yt][w_2\backslash z]] \rightarrow^{z}_{\mathtt{var}} \lambda y.x[x\backslash \lambda z.(w_1z)[w_1\backslash yt]] \rightarrow^{y}_{\mathtt{abs}}\\
\lambda y.(\lambda z.w_1z)[w_1\backslash yt] \rightarrow^{y}_{\mathtt{app}} \lambda y.(\lambda z.(x_1x_2)z)[x_1\backslash y][x_2\backslash t] \rightarrow^{y}_{\mathtt{var}} \lambda y.(\lambda z.(yx_2)z)[x_2\backslash t]
\end{array}$$

Notice that the focused variable changes from y to z, then back to y. This is because →st constructs the innermost skeletons first.

**Lemma 13.** *The reduction relation* →st *is confluent and terminating.*

Thus, from now on, we denote by ⇓st the function relating a term of T to its unique st-nf.

<sup>4</sup> The order of variables in the set θ is indeed irrelevant.

<sup>5</sup> Since <sup>→</sup>st acts only on terms in <sup>T</sup>, it is handled by linear substitution.

**Lemma 14 (Correctness of** →st**).** *Let* p ∈ Λ *and* q1,...,qn *be the MFEs of* λy.p*. Then* λy.z[z\p] ⇓st λy.{{p}}{y}⟨x1,...,xn⟩[xi\qi]i≤n*, where the variables* x1,...,xn *are fresh and pairwise distinct.*

Since the small-step semantics is contained in the λR-calculus, we use it to build our call-by-need strategy of λR.

**The strategy.** The **call-by-need strategy** →flneed (Fig. 3) is defined on the set of terms U, by using closure under the *need contexts*, given by the grammar N ::= ✷ | N t | N[x\t] | N⟨⟨x⟩⟩[x\N], where N⟨⟨t⟩⟩ denotes capture-free application of contexts (Sec. 2). As for call-by-name (Sec. 4.1), the call-by-need strategy is *weak*, because no *meaningful* reduction steps are performed under abstractions.

L⟨λx.p⟩ u →dB L⟨p[x\u]⟩
N⟨⟨x⟩⟩[x\L⟨λy.p⟩] →spl L⟨LL⟨N⟨⟨x⟩⟩[x\\λy.p′]⟩⟩   if λy.z[z\p] ⇓st λy.LL⟨p′⟩
N⟨⟨x⟩⟩[x\\v] →sub N⟨⟨v⟩⟩[x\\v]

**Fig. 3.** Call-by-Need Strategy

Rule dB is the same one used to define →name. Although rules spl and sub could have been presented as a single rule of the form N⟨⟨x⟩⟩[x\L⟨λy.p⟩] → L⟨LL⟨N⟨⟨λy.p′⟩⟩[x\\λy.p′]⟩⟩, we prefer to keep them separate since they represent different stages in the strategy. Indeed, rule spl only uses node replication operations to compute the skeleton of the abstraction, while rule sub implements one-shot *linear* substitution.

Notice that as a particular case of Lem. 9, t ∈ U and t →flneed t′ implies t′ ∈ U. Another interesting property is that t →sub t′ implies lvz(t) ≥ lvz(t′). Moreover, →flneed is deterministic.

*Example 15.* Let t<sup>0</sup> = (λx.(I(Ix)))λy.yI. Needed variable occurrences are highlighted in orange .

t0 →dB (I(Ix))[x\λy.yI]
→dB x1[x1\Ix][x\λy.yI]
→dB x1[x1\x2[x2\x]][x\λy.yI]
→spl x1[x1\x2[x2\x]][x\\λy.yz1][z1\I]
→sub x1[x1\x2[x2\λy.yz1]][x\\λy.yz1][z1\I]
→spl x1[x1\x2[x2\\λy.yz2][z2\z1]][x\\λy.yz1][z1\I]
→sub x1[x1\(λy.yz2)[x2\\λy.yz2][z2\z1]][x\\λy.yz1][z1\I]
→spl x1[x1\\λy.yz3][z3\z2][x2\\λy.yz2][z2\z1][x\\λy.yz1][z1\I]
→sub (λy.yz3)[x1\\λy.yz3][z3\z2][x2\\λy.yz2][z2\z1][x\\λy.yz1][z1\I]

### **5 A Type System for the** *λ*R**-calculus**

This section introduces a quantitative type system V for the λR-calculus. Nonidempotent intersection [26] has one main advantage over the idempotent model

[14]: it gives *quantitative* information about the length of reduction sequences to normal forms [21]. Indeed, not only typability and normalization can be proved to be equivalent, but a measure based on type derivations provides an *upper bound* to normalizing reduction sequences. This was extensively investigated in different logical/computational frameworks [5,18,20,25,42,47]. However, no quantitative result based on types exists in the literature for the node replication model, including the attempts done for deep inference [30]. The typing rules of our system are in themselves not surprising (see [46]), but they provide a handy quantitative characterization of fully lazy normalization (Sec. 6).

Types are built on the following grammar of types and multi-types, where α ranges over a set of base types and a is a special type constant used to type terms reducing to normal abstractions.

$$(\text{Types})\quad \sigma ::= \mathbf{a} \mid \alpha \mid \mathcal{M} \to \sigma \qquad\qquad (\text{Multi-Types})\quad \mathcal{M} ::= [\sigma_i]_{i \in I}$$

We write |M| to denote the **size of a multi-type** M. **Typing contexts**, written Γ, Δ, Σ, are functions from variables to multi-types, assigning the empty multiset to all but a finite set of variables. The domain of Γ is given by dom(Γ) := {x | Γ(x) ≠ [ ]}. The **union of contexts**, written Γ + Δ, is defined by (Γ + Δ)(x) := Γ(x) ⊎ Δ(x), where ⊎ denotes multiset union. An example is (x : [σ], y : [τ]) + (x : [σ], z : [τ]) = (x : [σ, σ], y : [τ], z : [τ]). This notion is extended to several contexts as expected, so that +i∈I Γi denotes a finite union of contexts (the empty context when I = ∅). We write Γ; Δ for Γ + Δ when dom(Γ) ∩ dom(Δ) = ∅. **Type judgments** have the form Γ ⊢ t : σ, where Γ is a typing context, t is a term and σ is a type.
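As a small illustration of these conventions, the following Haskell sketch represents multi-types as lists, typing contexts as finite maps (absent variables standing for the empty multiset), and context union as pointwise multiset union; the representation and the concrete types in the example are our own choices.

```haskell
import qualified Data.Map.Strict as M

-- Types: the constant a, base types, and multiset arrows.
data Ty = Ans | Base String | Arrow [Ty] Ty deriving (Eq, Show)

-- Typing contexts: variables absent from the map are assigned the empty multiset.
type Ctx = M.Map String [Ty]

domain :: Ctx -> [String]
domain = M.keys . M.filter (not . null)

-- Union of contexts: pointwise multiset union.
cunion :: Ctx -> Ctx -> Ctx
cunion = M.unionWith (++)

-- (x:[s], y:[t]) + (x:[s], z:[t]) = (x:[s,s], y:[t], z:[t]), as in the text.
main :: IO ()
main = print (cunion (M.fromList [("x", [s]), ("y", [t])])
                     (M.fromList [("x", [s]), ("z", [t])]))
  where t = Base "alpha"
        s = Arrow [t] t
```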


A **(typing) derivation** is a tree obtained by applying the (inductive) typing rules of system V (Fig. 4), introduced in [46]. The notation Φ ✄ Γ ⊢ t : σ means there is a derivation named Φ of the judgment Γ ⊢ t : σ in system V. A term t is typable in system V, or V-typable, iff there is a context Γ and a type σ such that Φ ✄ Γ ⊢ t : σ. The **size of a type derivation** sz(Φ) is defined as the number of its abs, app and ans rules. The typing system is **relevant** in the sense that Φ ✄ Γ ⊢ t : σ implies dom(Γ) ⊆ fv(t).

Type derivations can be measured by 3-tuples. We use a + operation on 3-tuples as pointwise addition: (a, b, c) + (e, f, g) = (a + e, b + f, c + g). These 3-tuples are computed by a **weighted derivation level** function defined on typing derivations as D(Φ) := M(Φ, 1), where M(−, −) is inductively defined below. In the cases (abs), (app) and (cut), we let Φt (resp. Φu) be the subderivation typing t (resp. u), and in (many) we let Φit be the i-th derivation typing t, for each i ∈ I.


Notice that the first and the third components of any 3-tuple M (Φ, m) do not depend on m. Intuitively, the first (resp. third) component of the 3-tuple counts the number of application/abstraction (resp. (ax)) rules in the typing derivation. The second one takes into account the number of application/abstraction rules as well, but *weighted* by the level of the constructor. The 3-tuples are ordered lexicographically.


*Example 16.* Let σ = [τ ] → τ . Consider the following type derivation Φ:

Φ ✄ y : [σ], z : [τ] ⊢ x[x\yz] : τ, built from (ax) rules for x, y and z, an (app) rule deriving yz : τ, and the rule typing the explicit cut.

This gives D(Φ) = (1, 2, 3). Moreover, for x[x\yz] →app (x1x2)[x1\y][x2\z] we have Φ′ ✄ y : [σ], z : [τ] ⊢ (x1x2)[x1\y][x2\z] : τ and D(Φ′) = (1, 1, 4).

### **6 Observational Equivalence**

The type system V characterizes normalization of both the name and flneed strategies, as follows: every typable term normalizes and every normalizable term is typable. In this sense, system V can be seen as a (quantitative) *model* [17] of our call-by-name and call-by-need strategies. We prove these results by studying the appropriate lemmas, notably weighted subject reduction and weighted subject expansion. We then deduce observational equivalence between the name and flneed strategies from the fact that their associated normalization properties are both fully characterized by the same typing system.

**Soundness.** Soundness of system V w.r.t. both →name and →flneed is investigated in this section. More precisely, we show that typable terms are normalizing for both strategies. In contrast to the reducibility techniques needed to show this kind of result for simple types [34], soundness is achieved here by relatively simple combinatorial arguments, based again on decreasing measures. We start by studying the interaction between system V and linear as well as full substitution.

**Lemma 17 (Partial Substitution).** *Let* Φ ✄ Γ; x : M ⊢ C⟨⟨x⟩⟩ : σ *and let* ⊑ *denote multiset inclusion. Then there exists* N ⊑ M *such that for every* Φu ✄ Δ ⊢ u : N *we have* Ψ ✄ Γ + Δ; x : M\N ⊢ C⟨⟨u⟩⟩ : σ *and, for every* m ∈ ℕ*,* M(Ψ, m) = M(Φ, m) + M(Φu, m + lv✷(C)) − (0, 0, |N|)*.*

**Corollary 18 (Substitution).** *If* Φt ✄ Γ; x : M ⊢ t : σ *and* Φu ✄ Δ ⊢ u : M*, then* Φ ✄ Γ + Δ ⊢ t{x\u} : σ*, and for all* m ∈ ℕ *we have* M(Φ, m) ≤ M(Φt, m) + M(Φu, m + lvx(t))*. Moreover,* |M| > 0 *iff the inequality is strict.*

The key idea to show soundness is that the measure D(·) decreases along the reduction relations →name and →flneed:

**Lemma 19 (Weighted Subject Reduction).** *Let* Φt0 ✄ Γ ⊢ t0 : σ*.*

*1. If* t0 →π t1*, then there exists* Φt1 ✄ Γ ⊢ t1 : σ *such that* D(Φt0) = D(Φt1)*.*

*2. If* t0 →s t1*, then there exists* Φt1 ✄ Γ ⊢ t1 : σ *such that* D(Φt0) ≥ D(Φt1)*.*

*3. If* t0 →ndb t1*, then there exists* Φt1 ✄ Γ ⊢ t1 : σ *such that* D(Φt0) > D(Φt1)*.*

*4. If* t0 →flneed t1*, then there exists* Φt1 ✄ Γ ⊢ t1 : σ *such that* D(Φt0) > D(Φt1)*.*

*Proof.* By induction on r ∈ {π, s, ndb, flneed}, using Lem. 17 and Cor. 18.

**Theorem 20 (Typability implies** name**-Normalization).** *Let* Φt ✄ Γ ⊢ t : σ*. Then* t *is* name*-normalizing.*

*Proof.* Suppose t is not name-normalizing. Since →s is terminating by Cor. 5, every infinite →name-reduction sequence starting at t must necessarily contain an infinite number of dB-steps. Moreover, all terms in such an infinite sequence are typed by Lem. 19. Therefore, Lem. 19:3 (resp. Lem. 19:2) guarantees that all dB (resp. s) reduction steps involved in such a →name-reduction sequence strictly decrease (resp. do not increase) the measure D(·). This leads to a contradiction because the order > on 3-tuples is well-founded. Hence t is necessarily name-normalizing.

**Theorem 21 (Typability implies** flneed**-Normalization).** *Let* Φt ✄ Γ ⊢ t : σ*. Then* t *is* flneed*-normalizing. Moreover,* D(Φt) *is an upper bound on the length of the* flneed*-reduction sequence to* flneed*-nf.*

*Proof.* The property trivially holds by Lem. 19:4 since the lexicographic order on 3-tuples is well-founded.

**Completeness.** We address here completeness of system V with respect to →name and →flneed. More precisely, we show that terms that are normalizing in each strategy are typable. The basic step consists in guaranteeing that normal forms are typable.

The following lemma makes use of a notion of **needed variable**: nv(x) := {x}, nv(tu) := nv(t), nv(t[x\\u]) := nv(t), nv(λx.t) := ∅, nv(t[y\u]) := (nv(t) \ {y}) ∪ nv(u) if y ∈ nv(t) and nv(t[y\u]) := nv(t) otherwise.
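This function is directly executable; here is a Haskell sketch of it over a toy syntax with cuts and distributors, following the clauses above (the encoding is ours).

```haskell
import qualified Data.Set as S

type Name = String
data Term = Var Name | Lam Name Term | App Term Term
          | Cut Term Name Term          -- t[y\u]
          | Dist Term Name Term         -- t[x\\u]
  deriving Show

-- Needed variables, clause by clause as in the text.
nv :: Term -> S.Set Name
nv (Var x)      = S.singleton x
nv (App t _)    = nv t
nv (Dist t _ _) = nv t
nv (Lam _ _)    = S.empty
nv (Cut t y u)
  | y `S.member` nv t = S.delete y (nv t) `S.union` nv u
  | otherwise         = nv t

-- For instance, nv(x[x\yz]) = {y}.
main :: IO ()
main = print (nv (Cut (Var "x") "x" (App (Var "y") (Var "z"))))
```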

**Lemma 22 (**flneed**-nfs are Typable).** *Let* t *be in* flneed*-nf. Then there exists a derivation* Φ ✄ Γ ⊢ t : τ *such that for any* x ∉ nv(t)*,* Γ(x) = [ ]*.*

Because name-nfs are also flneed-nfs, we infer the following corollary for free.

**Corollary 23 (**name**-nfs are Typable).** *Let* t *be in* name*-nf. Then there is a derivation* Φ ✄ Γ ⊢ t : τ*.*

Now we need lemmas stating the behavior of partial and full (anti-)substitution w.r.t. typing.

**Lemma 24 (Partial Anti-Substitution).** *Let* C⟨⟨x⟩⟩*,* u *be terms s.t.* x ∉ fv(u) *and* Φ ✄ Γ ⊢ C⟨⟨u⟩⟩ : σ*. Then* ∃Γ′*,* ∃Δ*,* ∃M*,* ∃Φ′*,* ∃Φu *s.t.* Γ = Γ′ + Δ*,* Φ′ ✄ Γ′ + x : M ⊢ C⟨⟨x⟩⟩ : σ *and* Φu ✄ Δ ⊢ u : M*.*

**Corollary 25 (Anti-Substitution).** *Let* u *be a term s.t.* x ∉ fv(u) *and* Φ ✄ Γ ⊢ t{x\u} : σ*. Then* ∃Γ′*,* ∃Δ*,* ∃M*,* ∃Φ′*,* ∃Φu *s.t.* Γ = Γ′ + Δ*,* Φ′ ✄ Γ′; x : M ⊢ t : σ *and* Φu ✄ Δ ⊢ u : M*.*

To achieve completeness, we show that typing is preserved by anti-reduction. We decompose the property as follows:

**Lemma 26 (Subject Expansion).** *Let* Φt1 ✄ Γ ⊢ t1 : σ*. If* t0 →r t1*, where* r ∈ {π, s, ndb, flneed}*, then there exists* Φt0 ✄ Γ ⊢ t0 : σ*.*

*Proof.* The proof is by induction on →<sup>r</sup> and uses Lem. 24 and Cor. 25.

**Theorem 27 (**name**-Normalization implies Typability).** *Let* t *be a term. If* t *is* name*-normalizing, then* t *is* V*-typable.*

*Proof.* Let t be name-normalizing, so that t reduces to a name-nf u in n →name-steps. We reason by induction on n. If n = 0, then t = u is typable by Cor. 23. Otherwise, we have t →name t′ and t′ reduces to u in n−1 steps. By the *i.h.* t′ is typable, and thus by Lem. 26 (because →ns is included in →s), t turns out to be typable as well.

**Theorem 28 (**flneed**-Normalization implies Typability).** *Let* t *be a term. If* t *is* flneed*-normalizing, then* t *is* V*-typable.*

*Proof.* Similar to the previous proof but using Lem. 22 instead of Cor. 23.

Summing up, Thms. 20, 27, 21 and 28 give:

**Theorem 29.** *Let* t *be a* λR*-term.* t *is* name*-normalizing iff* t *is* flneed*-normalizing iff* t *is* V*-typable.*

All the technical tools are now available to conclude observational equivalence between our two evaluation strategies based on node replication. Let R be any reduction notion on ΛR. Then two terms t, u ∈ ΛR are said to be R**-observationally equivalent**, written t ≡R u, if for any context C, C⟨t⟩ is R-normalizing *iff* C⟨u⟩ is R-normalizing.

**Theorem 30.** *For all terms* t, u ∈ ΛR*,* t *and* u *are* name*-observationally equivalent iff* t *and* u *are* flneed*-observationally equivalent.*

*Proof.* By Thm. 29, t ≡name u means that C⟨t⟩ is V-typable *iff* C⟨u⟩ is V-typable, for all C. By the same theorem, this is also equivalent to saying that C⟨t⟩ is flneed-normalizing *iff* C⟨u⟩ is flneed-normalizing for any C, *i.e.* t ≡flneed u.

#### **7 Related Works and Conclusion**

Several calculi with ES bridge the gap between formal higher-order calculi and concrete implementations of programming languages (see a survey in [40]). The first such calculi, *e.g.* [1,16], were all based on *structural* substitution, in the sense that the ES operator is syntactically propagated step-by-step through the term structure until a variable is reached, when the substitution finally takes place. The correspondence between ES and Linear Logic Proof-Nets [24] led to the more recent notion of calculi *at a distance* [6,4,2], highlighting a natural new application of the Curry-Howard interpretation. These calculi implement linear/partial substitution *at a distance*, where the search for variable occurrences is abstracted out with context-based rewriting rules, so that no ES propagation rules are necessary. A third model was introduced by the seminal work of Gundersen, Heijltjes, and Parigot [33,34], which proposed the atomic λ-calculus to implement node replication.

Inspired by this last approach, we introduced the λR-calculus, capturing the essence of node replication. In contrast to [33], we work with an implicit (structural) mechanism of weakening and contraction, a design choice which keeps the focus on the node replication model at the core of our calculus and yields a rather simple and natural formalism, used in particular to specify evaluation strategies. Indeed, besides the proof of the main operational meta-level properties of our calculus (confluence, termination of the substitution calculus, simulations), we use linear and non-linear versions of λR to specify evaluation strategies based on node replication, namely call-by-name and call-by-need.

The first description of call-by-need was given by Wadsworth [60], where reduction is performed on *graphs* instead of terms. Weak call-by-need on *terms* was then introduced by Ariola and Felleisen [7], and by Maraist, Odersky and Wadler [54,53]. Reformulations were introduced by Accattoli, Barenbaum and Mazza [3] and by Chang and Felleisen [22]. Our call-by-need strategy is inspired by the calculus in [3], which uses the distance paradigm [6] to gather together meaningful and permutation rules, by clearly separating *multiplicative* from *exponential* rules, in the sense of Linear Logic [27].

Full laziness has been formalized in different ways. Pointer graphs [60,59] are DAGs allowing for an elegant representation of sharing. Labeled calculi [15] implement pointer graphs by adding annotations to λ-terms, which makes the syntax more difficult to handle. Lambda-lifting [38,39] implements full laziness by resorting to translations from λ-terms to supercombinators. In contrast to all the previous formalisms, our calculus is defined on standard λ-terms with explicit cuts, without the use of any complementary syntactical tool. So is Ariola and Felleisen's call-by-need [7]; however, their notion of full laziness relies on external (ad hoc) meta-level operations used to extract the skeleton. Our specification of call-by-need enjoys fully lazy sharing, where the skeleton extraction operation is internally encoded in the operational semantics of the term calculus. Last but not least, our calculus has strong links with proof theory, notably deep inference.

Balabonski [10,9] relates many formalisms of full laziness and shows that they are equivalent when considering the number of β-steps to a normal form. It would then be interesting to understand if his unified approach, (abstractly) stated by means of the theory of residuals [50,51], applies to our own strategy.

We have also studied the calculus from a semantical point of view, by means of intersection types. Indeed, the type system can be seen as a model of our implementations of call-by-name and call-by-need, in the sense that typability and normalization turn out to be equivalent.

Intersection types go back to [23] and have been used to provide characterizations of qualitative [14] as well as quantitative [21] models of the λ-calculus, where typability and normalization coincide. Quantitative models specified by means of non-idempotent types [26,48] were first applied to the λ-calculus (see a survey in [19]) and to several other formalisms ever since, such as call-by-value [25,20], call-by-need [42,5], call-by-push-value [31,18] and classical logic [47]. In the present work, we achieve for the first time a quantitative characterization of fully lazy normalization, which provides upper bounds for the length of reduction sequences to normal forms.

The characterizations provided by intersection type systems sometimes lead to observational equivalence results (*e.g.* [42]). In this work we succeed in proving observational equivalence for a fully lazy implementation of weak call-by-need, a result which would be extremely involved to prove by means of syntactical rewriting tools, as done for weak call-by-need in [7]. Moreover, our result implies that our node replication implementation of full laziness is observationally equivalent to standard call-by-name and to weak call-by-need (see [42]), as well as to the more semantical notion of neededness (see [45]).

A Curry-Howard interpretation of the logical *switch* rule of deep inference is given in [58,57] as an end-of-scope operator, thus introducing the *spinal atomic λ-calculus*. The calculus implements a refined optimization of call-by-need, where only the *spine* of the abstraction (tighter than the skeleton) is duplicated. It would be interesting to adapt the λR-calculus to spine duplication by means of an appropriate end-of-scope operator, such as the one in [37]. Further optimizations might also be considered.

Finally, this paper only considers weak evaluation strategies, *i.e.* with reductions forbidden under abstractions, but it would be interesting to extend our notions to full (strong) evaluations too [29,12]. Extending full laziness to classical logic would be another interesting research direction, possibly taking preliminary ideas from [36]. We would also like to investigate (quantitative) *tight* types for our fully lazy strategy, as done for weak call-by-need in [5], which does not seem evident in our node replication framework.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Nondeterministic and co-Nondeterministic Implies Deterministic, for Data Languages**

Bartek Klin†, Sławomir Lasota‡, and Szymon Toruńczyk§

> University of Warsaw, Warsaw, Poland {klin,sl,szymtor}@mimuw.edu.pl

**Abstract.** We prove that if a data language and its complement are both recognized by nondeterministic register automata (without guessing), then they are also recognized by deterministic ones.

**Keywords:** Data languages, register automata, determinizability, deterministic separability, sets with atoms, orbit-finite sets, nominal sets

### **1 Introduction**

Register automata are finite-state automata equipped with a finite number of registers that can store values from an infinite data domain. When processing an input string, an automaton compares the current input data value to its registers and, based on this comparison and on the current control state, it chooses its next control state and possibly stores the input value in one of its registers. In the original model, introduced over 25 years ago by Francez and Kaminski [15], data values can only be compared for equality and not for any other property. Subsequent extensions of the model allow for comparing data values with respect to some fixed relations such as a total order, or introduce alternation, variations on the allowed form of nondeterminism, etc.

It appears that register automata lack most of the good properties known from the classical theory of finite automata. For example, while languages of nondeterministic register automata are closed under unions and intersections, they are not closed under complement, and they do not determinize. Moreover, the expressivity of register automata is very sensitive to natural variants and extensions. Any of the following relaxations of the model leads to a strict increase of expressive power (see [15,23,1] for details):



<sup>†</sup> Supported by the European Research Council (ERC) under the EU Horizon 2020 programme (ERC consolidator grant LIPA, agreement no. 683080).


**–** adding the capability to nondeterministically guess data values.

In fact, almost every combination of these extensions leads to a different class of recognized languages. Furthermore, no satisfactory characterizations of languages of register automata in terms of regular expressions [17,20] or logic [23,12] are known. There are a few positive results: a simulation of two-way nondeterministic automata by one-way alternating automata with guessing [1], a Myhill-Nerode characterization of languages of deterministic automata [16,4,5], and the well-behaved class of languages definable by orbit-finite monoids [2], which admits equivalent characterisations in terms of logic [11] and a syntactic subclass of deterministic automata [7]. Nevertheless, register automata satisfy almost no semantic equivalences that hold for classical finite automata.

*Contribution.* Our primary contribution is a collapse result: if a language and its complement are both recognized by nondeterministic register automata (NRA), then they are both recognized by deterministic ones (DRA). In symbols, we prove the following equality of language classes:

$$\mathsf{NRA} \cap \mathsf{co-NRA} = \mathsf{DRA}.$$

This result is shown under the assumption that the data values can be compared only for equality, and it turns out to be quite fragile. For instance, it fails if the automata can compare data values using a total order relation. It also fails if NRA are additionally equipped with the capability of guessing fresh data values, even when data values can only be compared for equality.

Our secondary contribution is a collapse result for NRA with 1 register only (1-NRA), but over an arbitrary data domain that *admits well quasi-order* (wqo), meaning roughly that finite induced substructures of the data domain, ordered by embeddings, form a wqo. This includes both equality and ordered data domains. In short, we prove the following inclusion of language classes:

$$\text{1-NRA} \cap \text{co-1-NRA} \subseteq \text{DRA}.$$

The inclusion is strict, as some DRA languages are not recognizable by 1-NRA.

Our proofs are mostly self-contained, but use basic notions and results about sets with atoms [1], also known as nominal sets [24]. In particular, automorphisms of the data domain play a central role in our arguments, and we extensively use notions such as finite support and orbit-finiteness of sets. In both results, we prove that for every data language L ∈ NRA ∩ co-NRA the set of derivative languages w−<sup>1</sup>L is orbit-finite, i.e., finite up to automorphism of data values. The collapse then follows from an orbit-finite version of the Myhill-Nerode theorem.

In our primary contribution, orbit-finiteness of the set of derivative languages is a consequence of a key technical result (Lem. 1), an abstract observation about orbit-finite families of sets, which we believe may be of independent interest. As another example application of this lemma, we give a new proof of decidability of universality for unambiguous register automata (URA).

*Relation to other work.* Our primary result partially confirms a conjecture of Thomas Colcombet [10], according to which every two disjoint languages of NRA with guessing are separable by a language recognized by an URA. Working in the special case when the two NRA recognize complementary languages and have no guessing, we show more: both languages are then recognized not only by an URA but by a DRA.

NRA do not have good algorithmic properties: while the emptiness problem is PSpace-complete [14], the universality problem (does a given automaton accept all data words?) is undecidable [15] (it is decidable only for 1-NRA [14]). Universality becomes decidable for URA, as shown recently in [22] (2-ExpSpace upper bound, improved to 2-ExpTime upper bound in [8]), and language containment and equality for URA reduce polynomially to universality (see [8, Lemma 8]). As mentioned above, our results allow us to re-prove this decidability result.

Register automata have been intensively investigated, with respect both to their foundational properties [15,25,17,23] and to their applications to XML databases and logics [14] (see [26] for a survey). There are several other ways to extend finite-state machines with a capability to recognize languages over infinite alphabets. These include, apart from register automata: their abstract version – nominal automata or automata over atoms [4,5,1]; symbolic automata [13]; pebble automata [21]; and data automata [3,6].

*Acknowledgments.* We thank Lorenzo Clemente for posing the collapse question studied in this paper, and Joanna Ochremiak and Radek Piórkowski for valuable discussions.

#### **2 Data languages and register automata**

The model of register automata, as considered in this paper, is parametrized by an underlying relational structure Atoms over a finite vocabulary Σ. This structure constitutes a *data domain*; its elements are called *atoms*. A register automaton processes sequences of atoms, possibly coupled with labels from a fixed finite set. It may store atoms read from the input in its registers, and compare them with previously stored atoms using relations in Σ (equality included).

Here are some example data domains:


In the following we consider input alphabets of the form <sup>S</sup> <sup>×</sup> Atoms, where <sup>S</sup> is a finite set of labels. A *data word* is a finite sequence <sup>w</sup> <sup>∈</sup> (<sup>S</sup> <sup>×</sup> Atoms)∗, and a *data language* is a set of data words.

A *nondeterministic register automaton* (NRA) A consists of:

**–** an input alphabet of the form <sup>S</sup> <sup>×</sup> Atoms, for some finite set <sup>S</sup>,


$$(p, s, \varphi, \mathrm{st}, q) \in \Delta,\tag{1}$$

where p, q ∈ Q, s ∈ S, ϕ(x1,...,xr, x) is a quantifier-free Σ-formula with free variables in {x1,...,xr, x}, and st ∈ {1, . . . , r, none}.

Intuitively, ϕ defines a condition which needs to be satisfied by the register contents (x1,...,xr) and by the current atom (x) for a transition to happen, and st specifies the register in which the input atom is stored after the transition, st = none meaning that it is not to be stored in any register.

An NRA A is *deterministic* (DRA) if it has exactly one initial state and if for every two transition rules

$$(p, s, \varphi\_1, \text{ST}\_1, q\_1), \ (p, s, \varphi\_2, \text{ST}\_2, q\_2) \in \Delta,$$

such that <sup>ϕ</sup><sup>1</sup> <sup>∧</sup> <sup>ϕ</sup><sup>2</sup> is satisfiable in Atoms, we have st<sup>1</sup> <sup>=</sup> st<sup>2</sup> and <sup>q</sup><sup>1</sup> <sup>=</sup> <sup>q</sup>2. We write r-NRA, resp. r-DRA, when the number of registers r is fixed.

A configuration q(**a**) ∈ Q × (Atoms ∪ {⊥})r of A consists of a control state q ∈ Q and a content of registers **a** ∈ (Atoms ∪ {⊥})r, where ⊥ means that the content of a register is undefined (i.e., the register is empty). A rule (1) induces a transition p(**a**) −(s,a)→ q(**b**) from a configuration p(**a**) to a configuration q(**b**) if:


A *run* of A on a data word w = (s1, a1)···(sn, an) is a sequence

$$q\_0(\mathbf{a}\_0) \stackrel{(s\_1, a\_1)}{\longrightarrow} q\_1(\mathbf{a}\_1) \stackrel{(s\_2, a\_2)}{\longrightarrow} \dots \stackrel{(s\_n, a\_n)}{\longrightarrow} q\_n(\mathbf{a}\_n),$$

where q0 is an initial state and **a**0 is a tuple where the content of all registers is undefined. We then say that the configuration qn(**a**n) is *reachable along* w. The set of all configurations reachable along w is finite; it is denoted A(w).

A run is *accepting* if it ends in a configuration with an accepting state. A data word w is *accepted* by A if there is an accepting run of A on w. An NRA is *unambiguous* (URA) if every word has at most one accepting run.

The *language* of A, denoted L(A), is the set of all data words accepted by A.
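To fix intuitions, here is a small executable Haskell sketch of this model over the equality atoms. Guards are represented directly as Haskell predicates on the register contents and the current atom (a simplification of quantifier-free formulas), labels from S are omitted as in the examples of the next section, and the toy automaton in `main` (words containing two equal consecutive letters) is our own; all names are assumptions of the sketch.

```haskell
import Data.List (nub)

type Atom = Int                         -- equality atoms, concretely integers
type Regs = [Maybe Atom]                -- Nothing = empty register

-- A transition rule (p, phi, st, q); guard plays the role of phi.
data Rule q = Rule { from  :: q
                   , guard :: Regs -> Atom -> Bool
                   , store :: Maybe Int          -- Just i: store the input in register i
                   , to    :: q }

data NRA q = NRA { regCount :: Int, initials :: [q], finals :: [q], rules :: [Rule q] }

-- Successor configurations of one configuration on one input atom.
step :: Eq q => NRA q -> (q, Regs) -> Atom -> [(q, Regs)]
step aut (p, rs) a =
  [ (to r, maybe rs set (store r)) | r <- rules aut, from r == p, guard r rs a ]
  where set i = [ if j == i then Just a else x | (j, x) <- zip [0 ..] rs ]

-- The set A(w) of configurations reachable along w, and acceptance.
reach :: Eq q => NRA q -> [Atom] -> [(q, Regs)]
reach aut = foldl (\cs a -> nub [ c' | c <- cs, c' <- step aut c a ]) start
  where start = [ (q0, replicate (regCount aut) Nothing) | q0 <- initials aut ]

accepts :: Eq q => NRA q -> [Atom] -> Bool
accepts aut w = any ((`elem` finals aut) . fst) (reach aut w)

-- Toy 1-register NRA: words with two equal consecutive letters.
main :: IO ()
main = print (map (accepts aab) [[1, 2, 2, 3], [1, 2, 3, 1]])   -- [True,False]
  where aab = NRA { regCount = 1, initials = [0 :: Int], finals = [1]
                  , rules = [ Rule 0 (\_ _ -> True) (Just 0) 0
                            , Rule 0 (\rs a -> head rs == Just a) Nothing 1
                            , Rule 1 (\_ _ -> True) Nothing 1 ] }
```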

#### **3 Examples**

In all our examples, the finite component S of data alphabets will be a singleton set. We will therefore omit S when describing automata, so (1) will simplify to

$$(p, \varphi, \operatorname{sT}, q) \in \Delta.$$

Graphically, a transition rule like this will be presented as

an arrow from p to q labelled with ϕ and ↓n if st = n, and labelled with ϕ alone if st = none.

Furthermore, initial states and accepting states are marked graphically in the usual way.

*Example 1.* For the equality atoms, consider the language <sup>L</sup> <sup>⊆</sup> Atoms<sup>∗</sup> of those words where the first letter appears at some later position:

$$L = \{a\_1 \ldots a\_n \mid n > 1, a\_1 = a\_i \text{ for some } i > 1\}.$$

This language is recognized by a DRA with one register and three control states:

This automaton stores the first letter in its only register and then remains in the (non-accepting) state q until the letter is encountered again; then it moves to the accepting state r and stays there.
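For instance, the automaton just described can be written down directly as a deterministic step function on configurations; the following self-contained Haskell sketch is our own encoding of it, with the control states p, q, r rendered as P, Q, R.

```haskell
type Atom = Int
data St = P | Q | R deriving (Eq, Show)   -- R is the accepting state

-- Deterministic step function of the one-register automaton described above.
delta :: (St, Maybe Atom) -> Atom -> (St, Maybe Atom)
delta (P, _)       a = (Q, Just a)                        -- store the first letter
delta (Q, Just b)  a = (if a == b then R else Q, Just b)  -- seen again: accept forever
delta (R, r)       _ = (R, r)
delta (Q, Nothing) _ = (Q, Nothing)                       -- unreachable from the initial configuration

inL :: [Atom] -> Bool
inL w = fst (foldl delta (P, Nothing) w) == R

main :: IO ()
main = print (map inL [[1, 2, 1], [1, 2, 3], [7]])        -- [True,False,False]
```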

*Example 2.* Still for the equality atoms, consider the *reverse* of the language from Example 1, i.e., the language of those words where the last letter appears at some earlier position. This language is not recognized by any DRA, but it is recognized by a NRA with one register and three control states:

This automaton nondeterministically decides to store a letter in its register and then checks that the last letter is equal to the stored one.

*Example 3.* Still for the equality atoms, consider the *complement* of the language from Example 2, i.e., the language L of those words where the last letter does *not* appear at any earlier position. (In particular, we consider the empty word and all length-one words to be in this language.)

The language L is not recognized by any NRA. However, it becomes recognizable if automata are additionally equipped with the ability of *guessing*, that is, of updating the contents of their registers with arbitrary atoms, possibly different from the one that comes with the current input letter. Unlike NRA without guessing, those with guessing are closed under reversal [18, Def. 3 and Corollary 31], and the reversal of the language L is even recognized by a DRA.

*Example 4.* Automata from Ex. 1-3 work just as well over the dense order domain: the formulas in their transition rules simply do not use the order relation. However, over densely ordered atoms something more happens: the language from Ex. 3 is recognizable by a NRA without guessing.

The automaton has two registers. The idea is that, at any moment in an accepting run where these registers store atoms a<sup>1</sup> < a2:

(a) in the part of the word read so far, no letter is in the open interval (a1, a2);
(b) the last letter of the word will belong to that open interval.

Condition (a) can be ensured easily: upon reading a letter a that belongs to the open interval (a1, a2), the automaton will (enter an accepting state for the moment and) put a in one of the two registers. The register is chosen nondeterministically so that condition (b) remains true. If the currently input letter is not in the interval (a1, a2), the automaton enters a rejecting state for the moment, with the registers kept unchanged.

Special treatment is needed to deal with situations where the last letter of the word will be larger than (or smaller than) all the letters encountered so far. These are taken care of by introducing special control states where one of the two registers remains undefined.

*Example 5.* Fix k ≥ 2. Over equality atoms, consider the language Lk of all words w of length at least k whose kth last letter is equal to the last letter. Then Lk is recognised by a NRA with one register and k + 1 states, depicted below:

The complement of Lk is also recognised by an NRA, similar to the one above, but with x ≠ x1 in place of x = x1 in the last transition, and with an additional component for accepting words of length smaller than k. The language Lk is also recognised by a DRA with k registers, where register number i stores the letter which appeared on the latest seen position with index congruent to i, mod k. It has k states, for counting the index of the current position, mod k.

### **4 Main results**

Our primary contribution is:

**Theorem 1.** *Over equality atoms, if a data language and its complement are both recognizable by nondeterministic register automata, then they are both recognizable by deterministic register automata.*

Note that this result fails if automata with guessing are considered (see Ex. 3). Indeed, the language from Ex. 2 is recognized by a 1-NRA, and its complement in Ex. 3 is recognized by a 1-NRA with guessing, but they are not deterministically recognizable.

Moreover, the result fails (even without guessing) for densely ordered atoms. The counterexample is the same: the language from Ex. 2 is recognized by a 1-NRA, and its complement is recognized by a 2-NRA over densely ordered atoms as explained in Ex. 4, but they are not deterministically recognizable. Here the use of two registers in NRA is necessary, due to our secondary contribution: for a wide range of data domains, if a data language and its complement are both recognized by 1-NRA, then they are recognized by DRA.

We prove this for any data domain Atoms which admits wqo in the following sense. A *well quasi-order* (wqo) is a quasi-order (Z, ≼) such that for every infinite sequence z1, z2, ... ∈ Z there are 1 ≤ i < j with zi ≼ zj. For a finite set X, an X-labeled substructure of Atoms is a set B ⊆ Atoms together with a labelling ℓB : B → X. For two X-labeled substructures B and C of Atoms, we say that B embeds into C (written B ↪ C) if some automorphism π of Atoms, restricted to B, yields a label-preserving injection from B to C, so that ℓB = ℓC ∘ π|B. Let ageX(Atoms) be the set of all finite labeled substructures of Atoms, partially ordered by ↪. We say that Atoms *admits* wqo if for every finite set X, the quasi-order (ageX(Atoms), ↪) is a wqo. All data domains listed in Section 2 admit wqo [19]. They are also *oligomorphic* (see Sec. 5 below).
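Over the equality atoms this embedding relation is easy to compute: up to automorphism, a finite X-labeled substructure is determined by how many of its elements carry each label, so B ↪ C reduces to a pointwise comparison of label counts, and the wqo property becomes an instance of Dickson's lemma. The following Haskell sketch (ours, and specific to equality atoms) implements this reduction.

```haskell
import qualified Data.Map.Strict as M

type Atom = Int
type Labeled x = M.Map Atom x        -- a finite labeled substructure: atoms -> labels

-- How many elements of the substructure carry each label.
counts :: Ord x => Labeled x -> M.Map x Int
counts = M.fromListWith (+) . map (\l -> (l, 1)) . M.elems

-- Over the equality atoms, B embeds into C iff every label occurs in B
-- at most as often as in C.
embeds :: Ord x => Labeled x -> Labeled x -> Bool
embeds b c = M.isSubmapOfBy (<=) (counts b) (counts c)

main :: IO ()
main = print (embeds (M.fromList [(1, "a"), (2, "b")])
                     (M.fromList [(5, "b"), (7, "a"), (9, "a")]))   -- True
```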

**Theorem 2.** *Over any oligomorphic atoms that admit* wqo*, if a data language and its complement are both recognizable by nondeterministic register automata with one register, then they are recognizable by deterministic register automata.*

The rest of the paper consists of the proofs of Thms. 1 and 2, in Sec. 6 and 8, respectively, preceded by Sec. 5 that recalls basic definitions of the setting of sets with atoms which are used in the proofs. Our main technical lemma is proved in Sec. 6. Besides proving Thm. 1, in Sec. 7 we explain how it implies decidability of universality for unambiguous register automata.

#### **5 Orbit-finite automata**

Our proofs rely on some basic notions and results of the theory of sets with atoms [1], also known as nominal sets [24]. In this section we recall what is necessary to follow our arguments; this is part of a uniform abstract approach to register automata developed in [4,5,1].

Let Aut(Atoms) denote the group of all automorphisms of a relational structure Atoms. (For the equality atoms (ℕ, =) this means the group of all bijections; for the densely ordered atoms (ℚ, <), the group of monotone bijections.) We consider sets equipped with an action of this group, typically Atomsⁿ for some n ≥ 0 or Atoms∗ with the componentwise action.

*Group actions.* A (left) action of a group G on a set X is a mapping · : G×X → X such that 1 ·x = x and σπ ·x = σ ·(π ·x) for all σ, π ∈ G and x ∈ X. We then say that G *acts* on X, or that X is a G*-set*. For x ∈ X, we call the set {π · x | π ∈ G} the orbit *of* x; or an orbit *in* X. The orbits in X partition X into disjoint sets. We call X *orbit-finite* if it has finitely many orbits.

Group actions canonically extend along familiar set-theoretic constructions: if X and Y are G-sets then the cartesian product X × Y, the disjoint union X ⊎ Y, the set of sequences X∗, the powerset P(X) etc. are all G-sets, in the expected way. For example, G acts componentwise on X × Y via π · (x, y) = (π · x, π · y).

*Oligomorphicity.* A structure Atoms is *oligomorphic* if for every <sup>n</sup> <sup>∈</sup> <sup>N</sup>, the componentwise action of Aut(Atoms) on Atoms<sup>n</sup> induces finitely many orbits. All structures considered in this paper are oligomorphic; an example of a nonoligomorphic structure is the total order of integers.
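For instance, over the equality atoms the orbit of an n-tuple is determined by its equality pattern (which positions carry equal atoms), so Atomsⁿ has one orbit per partition of the n positions, i.e. the n-th Bell number of them. The following small Haskell illustration is ours; the canonical pattern and the Bell-triangle recurrence are implementation choices.

```haskell
-- Canonical equality pattern of a tuple: positions grouped by value, in order
-- of first occurrence; two tuples over the equality atoms lie in the same
-- orbit of Aut(Atoms) iff they have the same pattern.
eqPattern :: Eq a => [a] -> [[Int]]
eqPattern xs = [ [ j | (j, y) <- zip [0 ..] xs, y == x ]
               | (i, x) <- zip [0 ..] xs, x `notElem` take i xs ]

-- Number of orbits of Atoms^n = number of partitions of n positions
-- = n-th Bell number, computed via the Bell triangle.
bell :: Int -> Integer
bell n = head (iterate nextRow [1] !! n)
  where nextRow row = scanl (+) (last row) row

main :: IO ()
main = do
  print (eqPattern [3, 1, 3, 7])     -- [[0,2],[1],[3]]
  print (map bell [0 .. 4])          -- [1,1,2,5,15]
```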

*Supports.* Let Aut(Atoms) act on a set <sup>X</sup> and let <sup>x</sup> <sup>∈</sup> <sup>X</sup>. A *support* of <sup>x</sup> is any set <sup>S</sup> <sup>⊆</sup> Atoms such that the following implication holds for all <sup>π</sup> <sup>∈</sup> Aut(Atoms):

if π(s) = s for all s ∈ S then π · x = x.

An element x ∈ X is *finitely supported* if it has some finite support.

For many structures Atoms, finite supports of a fixed element are always closed under intersections. Then every finitely supported x has *the least support*, denoted sup(x). This happens in particular for the equality atoms (as proved in [24, Prop. 2.3] or in [5, Cor. 9.4]) and for the dense order atoms (as proved in [5, Prop. 9.5]). It is easy to prove that taking least supports commutes with group actions: <sup>π</sup> · sup(x) = sup(<sup>π</sup> · <sup>x</sup>) for every <sup>x</sup> <sup>∈</sup> <sup>X</sup> and <sup>π</sup> <sup>∈</sup> Aut(Atoms).

*Equivariance.* An element (or a subset, relation, function...) of an Aut(Atoms)-set is called *equivariant* if it is supported by the empty set; equivalently, it is fixed by every automorphism of Atoms. For example:


Standard set-theoretic relations such as set membership, or set containment, are equivariant. Indeed, x ∈ Z ↔ (π · x) ∈ (π · Z), etc.

If <sup>∼</sup> is an equivariant equivalence relation on <sup>X</sup> then Aut(Atoms) acts on the set X/∼, by π · C = {π · x | x ∈ C} for each ∼-equivalence class C ⊆ X.

*Register automata.* Fix a structure Atoms and let R be an NRA with input alphabet <sup>S</sup> <sup>×</sup> Atoms, control states <sup>Q</sup>, and with <sup>r</sup> registers. The group Aut(Atoms) acts on all the components of R:


π · q(a1,...,ar) = q(π(a1),...,π(ar)) (where π(⊥) = ⊥);


Furthermore, each of these components is orbit-finite, and each of its elements has a finite support. Using the terminology of [5], this means that register automata are a special case of *orbit-finite automata*.

By equivariance of all the components above, the language L(R) of a register automaton is an equivariant subset of <sup>A</sup><sup>∗</sup> = (<sup>S</sup> <sup>×</sup> Atoms)∗, considered with the componentwise action of Aut(Atoms) on A∗, i.e.

$$
\pi \cdot ((s\_1, a\_1), \dots, (s\_n, a\_n)) = ((s\_1, \pi \cdot a\_1), \dots, (s\_n, \pi \cdot a\_n)).
$$

*Myhill-Nerode theorem.* In order to prove that a language is deterministically recognizable, we use the following Myhill-Nerode characterization.

For an alphabet <sup>A</sup> <sup>=</sup> <sup>S</sup> <sup>×</sup> Atoms and data language <sup>L</sup> <sup>⊆</sup> <sup>A</sup>∗, consider its Myhill-Nerode equivalence ∼<sup>L</sup> ⊆ A<sup>∗</sup> × A∗, defined by

u ∼<sup>L</sup> v if and only if uw ∈ L ↔ vw ∈ L for all w ∈ A∗.

**Theorem 3.** *[5, Thm. 3.8 and Thm. 6.4] Let* Atoms *be oligomorphic and* <sup>L</sup> <sup>⊆</sup> (<sup>S</sup> <sup>×</sup> Atoms)<sup>∗</sup> *be an equivariant language. Then* <sup>L</sup> *is deterministically recognizable if and only if* (<sup>S</sup> <sup>×</sup> Atoms)∗/∼<sup>L</sup> *is orbit-finite.*

Among other things, this theorem immediately implies that the language from Ex. 2 is not deterministically recognizable, neither for the equality atoms nor for the total order atoms. Indeed, two words are Myhill-Nerode equivalent with respect to that language if and only if they contain the same set of letters. Therefore, the language cannot be deterministically recognizable, since automorphisms of Atoms preserve the number of distinct letters in a word.

### **6 Proof of Theorem 1**

In the proof, we will make use of an abstract notion of a split of a family of sets.

For any family F of subsets of a set X, a *split* of F is a pair (U, V) of sets which partition X: X = U ⊎ V, such that both U and V are *finite* unions of elements of F. Obviously, for any splits to exist, X = ⋃F must hold.
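On a finite toy universe the definition can be checked by brute force. The following Haskell sketch is only an illustration of ours (in the paper X = A∗ is infinite and F is orbit-finite); it enumerates all splits of a small family, counting the empty union as a union of zero members.

```haskell
import Data.List (intersect, nub, sort, subsequences)

-- splits xs fam: all pairs (U, V) partitioning xs such that both U and V
-- are unions of members of fam (brute force, finite toy universe only).
splits :: Ord a => [a] -> [[a]] -> [([a], [a])]
splits xs fam = nub
  [ (u, v)
  | us <- subsequences fam
  , vs <- subsequences fam
  , let u = norm (concat us)
  , let v = norm (concat vs)
  , null (u `intersect` v)
  , norm (u ++ v) == norm xs ]
  where norm = sort . nub

main :: IO ()
main = mapM_ print (splits [1, 2, 3 :: Int] [[1], [2], [3], [2, 3]])
```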

In the following lemma, Atoms is the equality atoms.

**Lemma 1.** *For any* Aut(Atoms)*-set* X *with finitely supported elements, and any equivariant, orbit-finite family* F *of finitely supported subsets of* X*, the set* G *of splits of* F *is orbit-finite. Moreover, a bound on the number of orbits of* G *and the maximal size of the support of an element in* G *are computable from the analogous bounds for* F*.*

As should be clear after reading Sec. 5, the set of splits of F is considered with the natural action of Aut(Atoms): <sup>π</sup> · (U, V )=(<sup>π</sup> · U, π · <sup>V</sup> ), where <sup>π</sup> · <sup>W</sup> <sup>=</sup> {π · x | x ∈ W} for W ⊆ X.

We will prove Lem. 1 in Sec. 6.2. For now, let us show how the lemma implies Thm. 1.

Let <sup>A</sup> and <sup>B</sup> be two NRA over an alphabet <sup>A</sup> <sup>=</sup> <sup>S</sup> <sup>×</sup> Atoms such that <sup>L</sup>(A) and L(B) partition A∗. We will show that the Myhill-Nerode equivalence of L = L(A) has orbit-finitely many classes. Together with Thm. 3, this will prove that L is deterministically recognizable.

Let C be the set of configurations of A ⊎ B (the disjoint union of A and B). Hence, C consists of tuples of the form q(**a**) where q is either a state of A or a state of B (but not both), and **a** is a tuple of elements of Atoms ∪ {⊥} of appropriate length. For c ∈ C denote

L_c := {w ∈ A∗ | A ⊎ B accepts w from configuration c} ,

and let F = {L_c | c ∈ C}. Since C is equivariant and orbit-finite, so is F. Moreover, if c = q(**a**) then L_c is finitely supported by the atoms in **a**. Clearly, every word (s_1, a_1)···(s_n, a_n) ∈ A∗ is supported by {a_1,...,a_n}. This means that F and X = A∗ satisfy the assumptions of Lem. 1, therefore F has only orbit-finitely many splits.

Every word v ∈ A∗ induces a partition of A∗ into two disjoint sets:

$$U_v = \{ w \in A^* \mid vw \in L \} \qquad \text{and} \qquad V_v = \{ w \in A^* \mid vw \notin L \}.$$

Moreover, the sets U_v and V_v are finite unions of sets from F, namely

$$U_v = \bigcup_{c \in \mathcal{A}(v)} L_c \qquad \text{and} \qquad V_v = \bigcup_{c \in \mathcal{B}(v)} L_c.$$

These unions are finite because the automata A and B allow no guessing, and so A(v) and B(v), the sets of configurations reachable in A resp. B by reading the word v, are finite. Therefore, (U_v, V_v) is a split of F, for any word v.

By definition, u ∼_L v if and only if U_u = U_v. Consider any two words v, w ∈ A∗ such that the splits (U_v, V_v) and (U_w, V_w) are in the same orbit, i.e., U_w = π · U_v (and therefore also V_w = π · V_v) for some automorphism π. Since L is an equivariant language, we have π · U_v = U_{π·v} and so w ∼_L π · v. Theorem 1 now follows from Thm. 3.

#### **6.1 Examples**

Before proving Lem. 1, we give some examples of families of splits, which may be helpful in developing some intuitions.

The first example shows that the number of orbits of splits may grow as fast as double-exponentially, relative to the least supports of elements of F.

*Example 6.* For the equality atoms, fix k ≥ 1 and let X be the set of all k-tuples of pairwise distinct atoms. For each S ⊆ Atoms with |S| = k, let S(k) = S^k ∩ X and let M_S = X \ S(k). Note that S(k) is finite, with k! elements.

The family F ⊆ P(X) of all singletons in X and all sets M_S as above is equivariant and has two orbits. Each set in F has a support of size k.

For any K ⊆ S(k), consider the partition of X into K and X \ K. Then (K, X \ K) is a split of F, as K = ⋃_{v∈K} {v} and X \ K = M_S ∪ ⋃_{v∈S(k)\K} {v}.

Moreover, every split (U, V) of F is of the form (K, X \ K) or (X \ K, K) for some S and K as above. Indeed, suppose U = ⋃𝒰 and V = ⋃𝒱 for some finite 𝒰, 𝒱 ⊆ F. As U ∪ V = X is infinite, 𝒰 ∪ 𝒱 must contain M_S for some set S of k atoms. Suppose without loss of generality that M_S ∈ 𝒰. By disjointness of U and V, the family 𝒱 ⊆ F may only contain singletons {v}, for v ∈ S(k). Then (U, V) = (X \ K, K), where K = ⋃𝒱.

For K, K′ ⊆ S(k), the splits defined by K and K′ are in the same orbit only if there is an automorphism π that fixes S as a set, such that π · K = K′. Since there are only k! bijections on S, the set of splits of F has at least 2^{k!}/k! orbits.

The next example shows the difference between splits and the finite subfamilies of F that define those splits: the set of those families may be orbit-infinite.

*Example 7.* Let X be the set of all finite sets of equality atoms. For any distinct atoms a, b, define E_{a,b}, D_{a,b} ⊆ X by:

$$E_{a,b} = \{ F \in X \mid a \in F \leftrightarrow b \in F \} \qquad D_{a,b} = X \setminus E_{a,b}$$

Let F contain all sets E_{a,b} and D_{a,b}. This F has two orbits.

Obviously, (U, V) = (X, ∅) is a split of F; it is enough to take 𝒰 = {D_{a,b}, E_{a,b}} and 𝒱 = ∅ for any fixed a, b. However, there are many more minimal families 𝒰 and 𝒱 that achieve the same effect. Indeed, for any number n, and for any pairwise distinct atoms a_1,...,a_n, consider:

$$\mathcal{U} = \{D_{a_1, a_2}, D_{a_2, a_3}, \dots, D_{a_{n-1}, a_n}, E_{a_1, a_n}\} \qquad \mathcal{V} = \emptyset$$

It is easy to check that ⋃𝒰 = X. All such families are minimal (in fact, removing any element from 𝒰 would prevent it from being part of any split of F), and for each n these families form a separate orbit.

The following example shows that the statement of Lem. 1 fails if the atoms are (Q, <). It is obtained from Ex. 4 via the translation given in the proof of Thm. 1, and a simplification replacing each word by its last letter.

*Example 8.* The atoms are (Q, <). Let X = Q and let F ⊆ P(X) consist of:

**–** singletons {q} ⊆ X, for q ∈ Q;

**–** open intervals (p, q) ⊆ X, for p < q in Q ∪ {−∞, +∞}.

Then F has five orbits (here ±∞ are fixed under the action of Aut(Atoms)). For any finite set K ⊆ X, consider the partition of X into K and X \ K. Then K = ⋃_{q∈K} {q} whereas X \ K is the union of all intervals (p, q), where p < q are consecutive elements in K ∪ {−∞, +∞}. Hence, (K, X \ K) is a split of F. In particular, the set of all splits of F has infinitely many orbits, because the set of finite subsets of X has infinitely many orbits.
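To make the covering in Example 8 concrete, here is a small Python sketch (not from the paper; the function name and the tuple-based representation of intervals are ours): given a finite set K of rationals, it lists the singletons covering K and the open intervals, between consecutive elements of K ∪ {−∞, +∞}, covering X \ K.

```python
from fractions import Fraction

def split_for_finite_K(K):
    """Return the two finite subfamilies of F (singletons and open
    intervals) whose unions form the split (K, X \\ K) of Example 8."""
    pts = sorted(K)
    singletons = [("singleton", q) for q in pts]
    # Consecutive endpoints, with -infinity and +infinity added at the ends.
    endpoints = ["-inf"] + pts + ["+inf"]
    intervals = [("interval", lo, hi)
                 for lo, hi in zip(endpoints, endpoints[1:])]
    return singletons, intervals

# Example: K = {0, 1/2, 3} yields three singletons and the four intervals
# (-inf, 0), (0, 1/2), (1/2, 3), (3, +inf).
U_part, V_part = split_for_finite_K({Fraction(0), Fraction(1, 2), Fraction(3)})
print(U_part)
print(V_part)
```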

#### **6.2 Proof of Lemma 1**

We prove by induction a stronger statement, where the atoms are assumed to be an expansion of (N, =) by finitely many constants. In other words, in this section we will assume that Atoms is a structure over a vocabulary that consists of (equality and) a finite number of constant symbols; the universe of Atoms is N, with the constants interpreted as some pairwise distinct numbers. The group Aut(Atoms) then consists of all bijections of Atoms which fix every constant.

If Atoms is such a structure and T is a finite set of atoms all different from the constants, then by Atoms_T we denote the structure, over an extended vocabulary, that arises from Atoms by interpreting all the atoms in T as additional constants. Obviously, Aut(Atoms_T) is a subgroup of Aut(Atoms), so every action of Aut(Atoms) on a set X restricts to an action of Aut(Atoms_T). This restriction preserves and reflects the existence of finite supports: an element x ∈ X is supported by some S in the action of Aut(Atoms) if and only if it is supported by S \ T in the restricted action of Aut(Atoms_T). In particular, if Atoms is an expansion of (N, =) by finitely many constants, then every finitely supported element x has a least support sup(x). Note that sup(x) never contains any constants, since those can always be safely removed from any support.

For a subset U of an orbit-finite equivariant set F, its *dimension* dim(U) is the maximum size of the least support of an element of U. This makes sense even if U is infinite, because F is orbit-finite and sets from the same orbit have least supports of the same size. In particular, dim(F) is well defined.

The following lemma says that adding constants to atoms preserves orbit-finiteness. It is a standard result in the theory of sets with atoms, see e.g. [1, Lem. 3.19] or [24, Lem. 5.22]; indeed, it is a fundamental property of oligomorphic structures, but we re-prove it here to extract explicit bounds:

**Lemma 2.** *Fix a finite set* T ⊆ Atoms*. For any orbit-finite* Aut(Atoms)*-set* F *with* l *orbits, the corresponding action of* Aut(Atoms_T) *on* F *is also orbit-finite, with at most* l · (|T| + 1)^dim(F) *orbits.*

*Proof.* Assume first that F has only one orbit in the Aut(Atoms)-action, i.e., that l = 1. Let d = dim(F). Let Y denote the set of d-tuples of pairwise distinct atoms different from the constants in Atoms. This is a single-orbit set under the componentwise action of Aut(Atoms). Pick any x_0 ∈ F. Let y_0 = (a_1,...,a_d) ∈ Y be an enumeration of sup(x_0). There is a unique equivariant surjection f : Y → F such that f(π · y_0) = π · x_0 for all π ∈ Aut(Atoms). (The function f is total since Y has one orbit; it is well defined because y_0 enumerates a support of x_0, and it is surjective since F has one orbit.) Two tuples in Y are in the same orbit in the action of Aut(Atoms_T) if and only if they contain the same arrangement of atoms from T at the same positions. There are at most (|T| + 1)^d such arrangements (in fact fewer when d > 1, because the entries of a tuple in Y are pairwise distinct), so Y has at most (|T| + 1)^d such orbits. F is the image of Y under the equivariant function f, so the same bound applies to F. For a set F with l orbits, each of dimension at most d, the bound simply multiplies by l.
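As a quick sanity check of the counting step (our own illustration, not part of the proof), the exact number of Aut(Atoms_T)-orbits of d-tuples of pairwise distinct atoms can also be computed: an orbit is determined by an injective partial assignment of constants from T to positions, and the bound (|T| + 1)^d simply drops the injectivity requirement.

```python
from math import comb, perm

def exact_orbits_of_Y(d, t):
    """Number of Aut(Atoms_T)-orbits of d-tuples of pairwise distinct atoms
    when |T| = t: choose which positions carry constants and an injective
    assignment of constants to those positions."""
    return sum(comb(d, j) * perm(t, j) for j in range(min(d, t) + 1))

def bound_of_lemma_2(d, t):
    """The cruder bound (|T| + 1)**d used in the proof of Lemma 2."""
    return (t + 1) ** d

for d, t in [(1, 2), (2, 2), (3, 4)]:
    print(d, t, exact_orbits_of_Y(d, t), "<=", bound_of_lemma_2(d, t))
```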

From now on consider Atoms as described above, and let X and F be as in the statement of Lem. 1. The following key lemma says that every split of F has a support of a bounded size.

**Lemma 3.** *Let* U ⊎ V *be a split of* F *and let* 𝒰, 𝒱 *be finite subfamilies of* F *such that* ⋃𝒰 = U *and* ⋃𝒱 = V*. Then* 𝒰 *and* 𝒱 *each have a support of size at most* N*, for some bound* N *computable only from* dim(𝒰), dim(𝒱), dim(F) *and the number of orbits in* F*.*

The crux of this lemma is that the number N does not depend on the split U ⊎ V. It only depends on the number of orbits in F, its dimension dim(F), and on dim(𝒰) and dim(𝒱) (which, anyway, are bounded from above by dim(F)).

*Proof (of Lem. 3).* We proceed by induction on k = dim(𝒰) + dim(𝒱). Fix k ≥ 0 and assume that the statement of the lemma holds for all smaller values of k. Without loss of generality, we may assume that ∅ belongs to neither 𝒰 nor 𝒱 (as it can be safely removed from each of them).

For a finitely supported set F ⊆ X define

$$F^\sharp := \{ \pi \cdot y \mid \pi \in \mathrm{Aut}(\mathrm{Atoms}),\ y \in F,\ \mathrm{sup}(y) \cap \mathrm{sup}(F) = \emptyset \}.$$

Intuitively, F♯ arises by taking all elements of F that are "fresh for F", i.e., ones whose supports share no atoms with the support of F, and then applying arbitrary atom automorphisms to those elements. Note that F♯ is equivariant and F♯ = (π · F)♯ for any automorphism π.

**Claim 1** X = ⋃_{F ∈ 𝒰∪𝒱} F♯.

*Proof.* Take any x ∈ X. Let S = ⋃_{F ∈ 𝒰∪𝒱} sup(F). Since 𝒰 and 𝒱 are finite, S is a finite set. Pick an automorphism π such that its inverse π^{-1} maps sup(x) to a set disjoint from S. Consider the element y = π^{-1} · x ∈ X. Since ⋃𝒰 ∪ ⋃𝒱 = U ∪ V = X, there must be some F ∈ 𝒰 ∪ 𝒱 such that y ∈ F. Then x = π · y ∈ F♯.

Let us first prove the lemma for the **special case** where X = F♯ for some F ∈ 𝒰 ∪ 𝒱. Suppose that X = F♯ for some F ∈ 𝒰 (the case F ∈ 𝒱 is symmetric).

**Claim 2** Every y ∈ X with sup(y) ∩ sup(F) = ∅ belongs to F.

*Proof.* Take any y as above. As X = F♯, there is some π and x ∈ F such that y = π · x and sup(x) ∩ sup(F) = ∅. Pick an automorphism θ such that:

**–** θ(a) = π(a) for every a ∈ sup(x);

**–** θ(a) = a for every a ∈ sup(F).

Such a θ exists since sup(x) and sup(y) are both disjoint from sup(F). Then θ · x = π · x = y by the first property above, and θ · x ∈ θ · F = F by the second property. Altogether, y ∈ F.

**Claim 3** For every G ∈ 𝒱, sup(F) ∩ sup(G) ≠ ∅.

*Proof.* We show that if sup(G) is disjoint from sup(F) then G must be empty, contradicting our previous assumption.

Suppose x ∈ G. Pick an automorphism π which fixes sup(G) pointwise and maps sup(x) to a set disjoint from sup(F). Such a π exists because sup(G) and sup(F) are disjoint. Letting y := π · x, we have y ∈ F by Claim 2, and moreover y = π · x ∈ π · G = G. Then y ∈ F ∩ G ⊆ U ∩ V = ∅, a contradiction. This proves G = ∅, which in turn contradicts the assumption that ∅ ∉ 𝒱.

Denote T = sup(F). If T = ∅ then by Claim 3, 𝒱 has dimension 0 and therefore 𝒱 is supported by the empty set. So we may assume that T ≠ ∅. For the same reason we may assume that the family 𝒱 is not empty.

Let Atoms<sup>T</sup> be obtained from Atoms by including the elements of T as new constants. Hence, Atoms<sup>T</sup> extends Atoms by at most r constants, where r := dim(F).

Let l be the number of orbits in F. By Lem. 2, the family F, treated as a family of sets over the atoms Atoms_T, is still orbit-finite, with the number of orbits l′ depending only on l and r. Clearly, U ⊎ V remains a split of F. Note that if F ∈ F is supported by some set S over Atoms, then F is supported by S, indeed even by S \ T, over Atoms_T. In particular, the dimension of F does not increase by moving from Atoms to Atoms_T. More interestingly, by Claim 3, the least supports of all the elements in 𝒱 actually *decrease* when considering Atoms_T as atoms. Since 𝒱 is not empty, the dimension of 𝒱 strictly decreases and it follows that dim(𝒰) + dim(𝒱) < k over Atoms_T. Applying the inductive assumption yields a set T′ of size N′, depending on k − 1 and l′, such that T′ supports 𝒱 over Atoms_T. By construction, 𝒱 is supported by T ∪ T′ over Atoms. Note that

$$|T \cup T'| \le N'' := N' + r.$$

This concludes the proof in the special case when X = F♯ for some F ∈ 𝒰 ∪ 𝒱. In the **general case**, for each F ∈ 𝒰 ∪ 𝒱 define:

$$\mathcal{F}_F := \left\{ G \cap F^\sharp \mid G \in \mathcal{F} \right\}$$

$$\mathcal{U}_F := \left\{ G \cap F^\sharp \mid G \in \mathcal{U} \right\} \qquad \qquad \mathcal{V}_F := \left\{ G \cap F^\sharp \mid G \in \mathcal{V} \right\}$$

$$U_F := U \cap F^\sharp = \bigcup \mathcal{U}_F \qquad \qquad V_F := V \cap F^\sharp = \bigcup \mathcal{V}_F.$$

Then ⋃F_F = F♯ and (U_F, V_F) is a split of F_F which falls into the special case considered above. Hence, U_F has some support S_F of size at most N″.

Then U is supported by S := ⋃_{F ∈ 𝒰∪𝒱} S_F. Note that S_F only depends on the orbit of F, as F♯ = (π · F)♯ for any automorphism π. As there are at most l such orbits contained in F, it follows that S has size at most N := N″ · l. This concludes the inductive step, and the proof of Lem. 3.

Using Lem. 3, we now proceed to prove Lem. 1.

*Proof (of Lemma 1).* Consider an equivariant set X and an equivariant, orbit-finite family F of finitely supported subsets of X. Let ((U_i, V_i))_{i∈I} be a family of splits of F. By Lem. 3, each one of these splits is supported by some set of a bounded size. Applying suitable automorphisms to each of these splits, we can obtain a family of splits ((U′_i, V′_i))_{i∈I} such that, for all i ∈ I:

**–** (U′_i, V′_i) is in the same orbit as (U_i, V_i);

**–** U′_i (and hence also V′_i = X \ U′_i) is supported by one fixed finite set S.
It is now enough to show that there are only finitely many subsets U ⊆ X supported by a fixed set S, which are unions of elements of F.

By Lem. 2 it follows that F has finitely many orbits under the action of the group Aut(Atoms_S) of all automorphisms which fix S pointwise. (Here, as in the statement of Lem. 1, Atoms are the pure equality atoms without any constants.) If a set U ⊆ X supported by S contains some F ∈ F as a subset, then it contains π · F for every π ∈ Aut(Atoms_S). In other words, U contains (the union of) the entire orbit of F under the action of Aut(Atoms_S). Since we assume that U is a union of elements of F, it is a union of (the unions of) orbits in F, and there are only finitely many of these.

This completes the proof of Lem. 1.

#### **7 Application to Unambiguous Register Automata**

Lemma 1 is interesting in its own right and its applications are not limited to the ones mentioned in Sec. 4. We shall now show how it can be used to decide universality (and hence also language containment and equality, cf. [8, Lem. 8]) of URA over the pure equality atoms Atoms.

**Theorem 4.** *[22, Thm. 14] The language containment and equality problems are decidable for unambiguous register automata.*

As an application of Lem. 1, we give an alternative decidability proof for the universality problem of URA. First, we prove a consequence of Lem. 1.

**Lemma 4.** *Let* X *be an equivariant set over equality atoms, and let* F *be an equivariant, orbit-finite family of finitely supported subsets of* X*. There is a bound* M*, computable from* dim(F) *and the number of orbits in* F*, such that every* P ⊆ F *which is a partition of* X *has size at most* M*.*

*Proof.* Let G = {U | (U, V) is a split of F}. By Lem. 1, G is orbit-finite. Moreover, its elements are finitely supported. Let P ⊆ F be a partition of X into nonempty subsets. For each 𝒰 ⊆ P, the union ⋃𝒰 belongs to G; in particular, we have 2^{|P|} elements of G, each containing different sets in P. The proof is completed by the following counting argument.¹

Let S = ⋃_{F ∈ P} sup(F). An S*-orbit* in G is an orbit in G with respect to the action of those atom permutations which fix S pointwise. Equivalently, it is an orbit in G viewed as an Aut(Atoms_S)-set. By Lem. 2, for any finite S ⊆ Atoms, the number of S-orbits in G is bounded by l · (|S| + 1)^k, where k and l are computable from dim(F) and the number of orbits of F.

Two splits G, G′ ∈ G in the same S-orbit contain the same elements of P: if G′ = π · G then by equivariance of F and G, for each F ∈ P we have F ⊆ G if and only if π · F ⊆ π · G, but π · F = F when π fixes S pointwise. Hence, for any two distinct 𝒰, 𝒰′ ⊆ P, their unions ⋃𝒰 and ⋃𝒰′ belong to different S-orbits in G, so there are at least 2^{|P|} such orbits. As |S| ≤ dim(F) · |P|, we get:

2^{|P|} ≤ l · (|S| + 1)^k ≤ l · (dim(F) · |P| + 1)^k.

It follows that |P| is bounded by some M computable from k, l, and dim(F).

<sup>1</sup> It exhibits the well-known fact that equality atoms have the NIP property studied in model theory.
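The final inequality can be turned into a concrete (if crude) bound. The following Python sketch of our own, with hypothetical parameter values, returns the largest partition size p for which 2^p ≤ l · (dim(F) · p + 1)^k still holds; by the argument above, any partition P drawn from F satisfies |P| ≤ that value.

```python
def partition_size_bound(l, k, d):
    """Largest p with 2**p <= l * (d*p + 1)**k.  The search stops once the
    inequality fails and the right-hand side grows by a factor < 2 per step,
    after which it can never hold again (the left-hand side doubles)."""
    best, p = 0, 0
    while True:
        if 2 ** p <= l * (d * p + 1) ** k:
            best = p
        elif (d * (p + 1) + 1) ** k < 2 * (d * p + 1) ** k:
            return best
        p += 1

# Hypothetical parameters: l = 3 orbits, k = 2, dim(F) = 2.
print(partition_size_bound(l=3, k=2, d=2))
```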

Lemma 4 has the following corollary, which is a strong restriction on the structure of universal URA and easily yields Thm. 4.

Call a configuration c of a NRA A *nonempty* if the NRA accepts some word from this configuration, i.e., the following language is nonempty:

$$L_c := \{ w \in A^* \mid \mathcal{A} \text{ accepts } w \text{ from } c \}$$

Since NRA emptiness is decidable, it is not difficult to modify any given NRA to one with only nonempty configurations. This transformation preserves URA, so we may safely assume that we only consider URA with this property.

**Corollary 1.** *Let* A *be a* URA *with nonempty configurations and which accepts every input word. Then there is a computable bound* M *such that* A *may reach at most* M *different configurations when reading any given input word.*

*Proof.* Let A be a URA over an input alphabet A = S × Atoms. Let C be the set of configurations of A and let F := {L_c | c ∈ C}. Note that dim(F) is not larger than the number of registers r of A, and the number of orbits in F is not larger than the number of orbits of configurations in A, which in turn is equal to the number of control states in A times the number of orbits in (Atoms ∪ {⊥})^r (equal to the (r + 1)-st Bell number).

For each w ∈ A∗, the set A(w) ⊆ C of configurations reachable when reading w is finite, since A has no guessing. Unambiguity of A implies that the family

$$\mathcal{P}_w := \{ L_c \mid c \in \mathcal{A}(w) \} \subseteq \mathcal{F}$$

consists of pairwise disjoint sets. If additionally L(A) = A∗, then P_w forms a partition of A∗, so |P_w| ≤ M where M is the bound from Lemma 4. As |A(w)| ≤ |P_w| (distinct configurations in A(w) have distinct languages, these being nonempty and pairwise disjoint), this yields the corollary.

Decidability of universality of URA now follows using standard ideas.

*Proof (of Thm. 4, sketch).* We use the notation of the proof of Cor. 1. The idea is to construct the truncated powerset automaton whose states are sets of at most M configurations of A.

Let C′ denote the family of subsets of C of size at most M; then C′ is orbit-finite. We define a deterministic automaton A′ with an infinite, but orbit-finite state space C′. Its transitions are X −a→ Y, for X, Y ∈ C′ such that

$$Y = \left\{ y \in C \; \middle| \; x \stackrel{a}{\longrightarrow} y \text{ in } \mathcal{A}, x \in X \right\}.$$

The initial state of A′ is the set C_0 ⊆ C of initial configurations of A (unless |C_0| > M, but then L(A) ≠ A∗ by the corollary). Accepting states are all states X ∈ C′ which contain an accepting configuration of A. All the ingredients of A′ are equivariant, orbit-finite sets, so A′ is an *orbit-finite deterministic automaton*, and can be effectively constructed given A and M. Its language L(A′) is defined as usual. By construction,

**–** L(A′) ⊆ L(A) ⊆ A∗;

**–** if L(A) = A∗ then L(A′) = A∗, by Cor. 1.

Hence, A is universal if and only if A′ is universal. Since A′ is orbit-finite, universality of A′ can be effectively decided, using standard techniques for orbit-finite automata [1,5]: by first complementing and then testing emptiness.
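The construction above cannot be enumerated directly (the state space is orbit-finite, not finite), but the truncation idea is easy to illustrate over a finite stand-in. Below is a minimal sketch of our own, with hypothetical dictionary-based encodings of configurations and transitions.

```python
def truncated_powerset(initial, delta, accepting, alphabet, M):
    """Reachable part of the truncated powerset automaton A': states are
    sets of at most M configurations; reaching a larger set witnesses,
    by Corollary 1, that the original automaton is not universal
    (assuming all configurations are nonempty).

    initial   -- iterable of initial configurations C_0
    delta     -- dict mapping (configuration, letter) to a set of successors
    accepting -- set of accepting configurations
    """
    start = frozenset(initial)
    if len(start) > M:
        return None                       # not universal
    states, transitions, todo = {start}, {}, [start]
    while todo:
        X = todo.pop()
        for a in alphabet:
            Y = frozenset(y for x in X for y in delta.get((x, a), ()))
            if len(Y) > M:
                return None               # not universal
            transitions[(X, a)] = Y
            if Y not in states:
                states.add(Y)
                todo.append(Y)
    accepting_states = {X for X in states if X & set(accepting)}
    return states, transitions, accepting_states
```

In this finite stand-in, universality of A′ amounts to every reachable state containing an accepting configuration, mirroring the complement-and-test-emptiness step of the proof.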

#### **8 Proof of Theorem 2**

Towards proving Thm. 2, assume A and B are two complementing 1-NRA over an alphabet A = S × Atoms and that Atoms admit wqo.

Recall that configurations of a 1-NRA are either of the form q(a) where q is a control state and a ∈ Atoms is the register value, or of the form q(⊥) when the register value is still undefined. We assume, without loss of generality, that both register automata A and B immediately update their register, i.e., every transition rule outgoing from an initial state updates the register.

Let Q and Q′ denote the sets of control states of A and B, respectively, and assume without loss of generality that Q and Q′ are disjoint.

For every nonempty data word w ∈ A+, the set A(w) ∪ B(w) of configurations of A and B reachable along w is finite, since NRA have no guessing, and contains no undefined configurations of the form q(⊥) due to the immediate update assumption. For every w ∈ A+ define a finite induced substructure C_w of Atoms, labeled with the finite set P = P(Q ∪ Q′), as follows. The elements of C_w are the atoms that appear in configurations in A(w) ∪ B(w):

$$\mathcal{C}_w = \{ a \in \mathrm{Atoms} \mid (q, a) \in \mathcal{A}(w) \cup \mathcal{B}(w) \text{ for some state } q \}$$

The labeling ℓ_w : C_w → P of C_w maps a ∈ C_w to the set of all control states which appear in A(w) ∪ B(w) together with a:

$$\ell_w(a) = \{ q \in Q \mid (q, a) \in \mathcal{A}(w) \} \cup \{ q \in Q' \mid (q, a) \in \mathcal{B}(w) \}.$$

Let L = L(A). For each v ∈ A∗ define the partition of A∗ into:

$$U_v = \{ w \in A^* \mid vw \in L \} \qquad \text{and} \qquad V_v = \{ w \in A^* \mid vw \notin L \}.$$

Recall that u ∼_L v if and only if U_u = U_v.

*Claim.* Let u, v ∈ A+. If C_u ⊑ C_v then π · u ∼_L v for some automorphism π.

*Proof.* By definition of ⊑, there is some π ∈ Aut(Atoms) which maps C_u to a substructure of C_v, so that π · C_u ⊆ C_v and

$$\ell_u(a) = \ell_v(\pi(a)) \qquad \text{for } a \in \mathcal{C}_u. \tag{2}$$

Let u′ = π · u. By equivariance of register automata, if A reaches a configuration (q, a) when reading u, then it reaches the configuration (q, π(a)) when reading u′ = π · u. Hence, C_{u′} ⊆ C_v and ℓ_u(a) = ℓ_{u′}(π(a)) for a ∈ C_u. Together with (2) we get ℓ_{u′}(a) = ℓ_v(a) for all a ∈ C_{u′}.

We show that this implies U_{u′} = U_v, which will yield the claim as u′ = π · u. Towards proving U_{u′} ⊆ U_v take any w ∈ U_{u′}; then u′w ∈ L. Pick an accepting run of A on u′w. Let q(a) be the configuration of A in this run reached after reading the (nonempty) prefix u′. In particular, A accepts w starting from the configuration q(a). Moreover, a ∈ C_{u′} and q ∈ ℓ_{u′}(a). As C_{u′} ⊆ C_v and ℓ_{u′}(a) = ℓ_v(a), it follows that A may reach the configuration q(a) after reading v. As w is accepted by A from this configuration, it follows that A accepts vw, so w ∈ U_v.

The inclusion V_{u′} ⊆ V_v is proved by a similar argument, using B instead of A, since L(B) = A∗ \ L(A) = A∗ \ L. As U_{u′} = A∗ \ V_{u′} and V_v = A∗ \ U_v, the inclusion V_{u′} ⊆ V_v implies U_{u′} ⊇ U_v. Altogether, U_{u′} = U_v, so u′ ∼_L v, yielding the claim.

Theorem 2 now follows easily: assume towards a contradiction that A∗/∼_L is not orbit-finite. Then there is an infinite set X ⊆ A+ such that π(u) ≁_L v for all distinct u, v ∈ X and all π ∈ Aut(Atoms). As Atoms admits wqo, there are distinct u, v ∈ X such that C_u ⊑ C_v. The claim above yields a contradiction.

### **9 Final remarks**

We have studied a deterministic collapse for NRA: if a language and its complement are both recognized by NRA then they are also recognized by DRA. We have proved this for register automata over equality atoms; and for automata with one register only, over any atoms that admit wqo. We have also applied our key technical observation, namely orbit-finiteness of the set of splits of an orbit-finite family of sets, in order to re-prove decidability of universality of URA.

The assumed form A = S × Atoms of the input alphabets is not important; the results apply to arbitrary orbit-finite input alphabets A.

The proof of our main result (also of decidability of universality of URA) is effective, with elementary bounds. In particular, given two NRA with complementing languages the equivalent DRA from Thm. 1 has an exponential number of registers and a doubly-exponential number of orbits of states. The same bounds apply to a DRA constructed in our proof of Thm. 4. Moreover, assuming Atoms satisfy standard effectiveness assumptions, like decidability of their first-order theory, one can also compute an equivalent DRA from Thm. 2.

Concerning possible generalisations of our results, we believe that Thm. 1 holds not only for equality atoms, but for arbitrary oligomorphic ω-stable atoms. These include e.g. the nested equality atoms mentioned in Sec. 2. On the other hand Thm. 1 does not extend to disjoint but non-complementing NRA languages: it is not true that for every two disjoint NRA languages there is a DRA language that *separates* them, i.e., includes one of them and is disjoint from the other. The corresponding decision problem (given two disjoint NRA, does a separating DRA exist?) is decidable when the number of registers of a separating automaton is fixed [9], and open in general.

An intriguing open question (not unlike the wqo Dichotomy Conjecture [19]) is whether it is necessary for Atoms to admit wqo for Thm. 2 to hold.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **Certifying Inexpressibility**

Orna Kupferman¹ and Salomon Sickert¹,²

¹ School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel. orna@cs.huji.ac.il, salomon.sickert@mail.huji.ac.il
² Technische Universität München, Munich, Germany. s.sickert@tum.de

**Abstract** Different classes of automata on infinite words have different expressive power. Deciding whether a given language L ⊆ Σ^ω can be expressed by an automaton of a desired class can be reduced to deciding a game between Prover and Refuter: in each turn of the game, Refuter provides a letter in Σ, and Prover responds with an annotation of the current state of the run (for example, in the case of Büchi automata, whether the state is accepting or rejecting, and in the case of parity automata, what the color of the state is). Prover wins if the sequence of annotations she generates is correct: it is an accepting run iff the word generated by Refuter is in L. We show how a winning strategy for Refuter can serve as a simple and easy-to-understand certificate of inexpressibility, and how it induces additional forms of certificates. Our framework handles all classes of deterministic automata, including ones with structural restrictions like weak automata. In addition, it can be used for refuting separation of two languages by an automaton of the desired class, and for finding automata that approximate L and belong to the desired class.

**Keywords:** Automata on infinite words · Expressive power · Games.

#### **1 Introduction**

Finite *automata on infinite objects* were first introduced in the 60's, and were the key to the solution of several fundamental decision problems in mathematics and logic [8,33,41]. Today, automata on infinite objects are used for specification, verification, and synthesis of nonterminating systems. The automata-theoretic approach reduces questions about systems and their specifications to questions about automata [28,49], and is at the heart of many algorithms and tools. Industrial-strength property-specification languages such as the IEEE 1850

 The full version of this article is available from [27]. Orna Kupferman is supported in part by the Israel Science Foundation, grant No. 2357/19. Salomon Sickert is supported in part by the Deutsche Forschungsgemeinschaft (DFG) under project numbers 436811179 and 317422601 ("Verified Model Checkers"), and in part funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No. 787367 (PaVeS).

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 385–405, 2021. https://doi.org/10.1007/978-3-030-71995-1_20

Standard for Property Specification Language (PSL) [14] include regular expressions and/or automata, making specification and verification tools that are based on automata even more essential and popular.

A run r of an automaton on infinite words is an infinite sequence of states, and acceptance is determined with respect to the set of states that r visits infinitely often. For example, in *Büchi* automata, some of the states are designated as accepting states, denoted by α, and a run is accepting iff it visits states from the accepting set α infinitely often [8]. Dually, in *co-Büchi* automata, a run is accepting if it visits the set α only finitely often. Then, in *parity* automata, the acceptance condition maps each state to a color in some set C = {j, . . . , k}, for j ∈ {0, 1} and some *index* k ≥ 0, and a run is accepting if the maximal color it visits infinitely often is odd.

The different classes of automata have different *expressive power*. For example, while deterministic parity automata can recognize all ω-regular languages, deterministic Büchi automata cannot [29]. We use DBW, DCW, and DPW to denote a deterministic Büchi, co-Büchi, and parity word automaton, respectively, or (this would be clear from the context) the set of languages recognizable by the automata in the corresponding class. There has been extensive research on expressiveness of automata on infinite words [48,20]. In particular, researchers have studied two natural expressiveness hierarchies induced by different classes of deterministic automata. The first hierarchy is the *Mostowski Hierarchy*, induced by the index of parity automata [35,50]. Formally, let DPW[0, k] denote a DPW with C = {0,...,k}, and similarly for DPW[1, k] and C = {1,...,k}. Clearly, DPW[0, k] ⊆ DPW[0, k + 1], and similarly DPW[1, k] ⊆ DPW[1, k + 1]. The hierarchy is infinite and strict. Moreover, DPW[0, k] complements DPW[1, k + 1], and for every k ≥ 0, there are languages L_k and L′_k such that L_k ∈ DPW[0, k] \ DPW[1, k + 1] and L′_k ∈ DPW[1, k + 1] \ DPW[0, k]. At the bottom of this hierarchy, we have DBW and DCW. Indeed, DBW = DPW[0, 1] and DCW = DPW[1, 2].

While the Mostowski Hierarchy refines DPWs, the second hierarchy, which we term the *depth hierarchy*, refines deterministic *weak* automata (DWWs). Weak automata can be viewed as a special case of Büchi or co-Büchi automata in which every strongly connected component in the graph induced by the structure of the automaton is either contained in α or disjoint from α, where α is, depending on the acceptance condition, the set of accepting or rejecting states. The structure of weak automata captures the alternation between greatest and least fixed points in many temporal logics, and they were introduced in this context in [36]. DWWs have been used to represent vectors of real numbers [6], and they have many appealing theoretical and practical properties [32,21]. In terms of expressive power, DWW = DCW ∩ DBW.

The depth hierarchy is induced by the depth of alternation between accepting and rejecting components in DWWs. For this, we view a DWW as a DPW in which the colors visited along a run can only increase. Accordingly, each run eventually gets trapped in a single color, and is accepting iff this color is odd. We use DWW[0, k] and DWW[1, k] to denote weak-DPW[0, k] and weak-DPW[1, k], respectively. The picture obtained for the depth hierarchy is identical to that of the Mostowski hierarchy, with DWW[j, k] replacing DPW[j, k] [50]. At the bottom of the depth hierarchy we have *co-safety* and *safety* languages [2]. Indeed, co-safety languages are DWW[0, 1] and safety languages are DWW[1, 2].

Beyond the theoretical interest in expressiveness hierarchies, their study is motivated by the fact that many algorithms, like synthesis and probabilistic model checking, need to operate on deterministic automata [5,3]. The lower the automata are in the expressiveness hierarchy, the simpler the algorithms for reasoning about them. Simplicity goes beyond complexity, which typically depends on the parity index [16], and involves important practical considerations like minimization and canonicity (which exist only for DWWs [32]), circumvention of Safra's determinization [26], and symbolic implementations [47]. Of special interest is the characterization of DBWs. For example, it is shown in [25] that given a *linear temporal logic* formula ψ, there is an *alternation-free* μ*-calculus* formula equivalent to ∀ψ iff ψ can be recognized by a DBW. Further research studies *typeness* for deterministic automata, examining the ability to define a weaker acceptance condition on top of a given automaton [19,21].

Our goal in this paper is to provide a simple and easy-to-understand explanation of inexpressibility results. The need to accompany results of decision procedures by an explanation (often termed "certificate") is not new, and includes certification of a "correct" decision of a model checker [24,44], reachability certificates in complex multi-agent systems [1], and explainable reactive synthesis [4]. To the best of our knowledge, our work is the first to provide certification of inexpressibility results.

The underlying idea is simple: Consider a language L and a class γ of deterministic automata. We consider a turn-based two-player game in which one player (Refuter) provides letters in Σ, and the second player (Prover) responds with letters from a set A of annotations that describe states in a deterministic automaton. For example, when we consider a DBW, then A = {acc, rej}, and when we consider a DPW[0, k], then A = {0,...,k}. Thus, during the interaction, Refuter generates a word x ∈ Σ^ω and Prover responds with a word y ∈ A^ω. Prover wins if for all words x ∈ Σ^ω, we have that x ∈ L iff y is accepting according to γ. Clearly, if there is a deterministic γ automaton for L, then Prover can win by following its run on x. Dually, a finite-state winning strategy for Prover induces a deterministic γ automaton for L. The game-based approach is not new, and has been used for deciding the membership of given ω-regular languages in different classes of deterministic automata [26]. Further, the game-based formulation is used in descriptive set theory to classify sets into hierarchies, see for example [39, Chapters 4 and 5] for an introduction that focuses on ω-regular languages. Our contribution is a study of strategies for Refuter. Indeed, since the above described game is determined [9] and the strategies are finite-state, Refuter has a winning strategy iff no deterministic γ automaton for L exists, and this winning strategy can serve as a certificate for inexpressibility.

*Example 1.* Consider the language L°a ⊆ {a, b}^ω of all words with only finitely many a's. It is well known that L°a cannot be recognized by a DBW [29].

**Figure 1.** A refuter for DBW-recognizability of "only finitely many a's".

In Figure 1 we describe what we believe to be the neatest proof of this fact. The figure describes a transducer R with inputs in {acc, rej} and outputs in {a, b} – the winning strategy of Refuter in the above described game. The way to interpret R is as follows. In each round of the game, Prover tells Refuter whether the run of her DBW for L°a is in an accepting or a rejecting state, and Refuter uses R in order to respond with the next letter in the input word. For example, if Prover starts with acc, namely declaring that the initial state of her DBW is accepting, then Refuter responds with a, and if Prover continues with rej, namely declaring that the state reachable with a is rejecting, then Refuter responds with b. If Prover continues with rej forever, then Refuter continues with b forever. Thus, together Prover and Refuter generate two words: y ∈ {acc, rej}^ω and x ∈ {a, b}^ω. Prover wins whenever x ∈ L°a iff y contains infinitely many acc's. If Prover indeed has a DBW for L°a, then she can follow its transition function and win the game. By following the refuter R, however, Refuter can always fool Prover and generate a word x such that x ∈ L°a iff y contains only finitely many acc's.
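The strategy can be played out mechanically. Below is a small simulation of our own (not from the paper): Refuter answers acc with a and rej with b, and we run it for finitely many rounds against a hypothetical 2-state DBW that accepts whenever the last letter read was b.

```python
def play_refuter_vs_dbw(delta, accepting, q0, rounds=12):
    """One play of the game of Example 1: Prover follows the given DBW,
    Refuter answers annotation 'acc' with 'a' and 'rej' with 'b'."""
    q, xs, ys = q0, [], []
    for _ in range(rounds):
        ann = "acc" if q in accepting else "rej"   # Prover's annotation
        letter = "a" if ann == "acc" else "b"      # Refuter's response
        ys.append(ann)
        xs.append(letter)
        q = delta[(q, letter)]
    return "".join(xs), ys

# Hypothetical DBW attempt: state qb (accepting) after reading b, qa after a.
delta = {("qa", "a"): "qa", ("qa", "b"): "qb",
         ("qb", "a"): "qa", ("qb", "b"): "qb"}
x, y = play_refuter_vs_dbw(delta, accepting={"qb"}, q0="qb")
print(x)   # 'abababababab': Refuter keeps forcing further a's
print(y)   # while the annotation keeps returning to acc
```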

We first define refuters for DBW-recognizability, and study their construction and size for languages given by deterministic or nondeterministic automata. Our refuters serve as a first inexpressibility certificate. We continue and argue that each DBW-refuter for a language L induces three words x ∈ Σ^∗ and x_1, x_2 ∈ Σ^∗, such that x · (x_1 + x_2)^∗ · x_1^ω ⊆ L and x · (x_1^∗ · x_2)^ω ∩ L = ∅. The triple ⟨x, x_1, x_2⟩ is an additional certificate for L not being in DBW. Indeed, we show that a language L is not in DBW iff it has a certificate as above. For example, the language L°a has a certificate ⟨ε, b, a⟩. In fact, we show that Landweber's proof for L°a can be used as is for all languages not in DBW, with x_1 replacing b, x_2 replacing a, and adding x as a prefix.

We then generalize our results on DBW-refutation and certification in two orthogonal directions. The first is an extension to richer classes of deterministic automata, in particular all classes in the two hierarchies discussed above, as well as all deterministic Emerson-Lei automata (DELWs) [17]. For the depth hierarchy, we add to the winning condition of the game a *structural restriction*. For example, in a weak automaton, Prover loses if the sequence y ∈ A^ω of annotations she generates includes infinitely many alternations between acc and rej. We show how structural restrictions can be easily expressed in our framework.

The second direction is an extension of the recognizability question to the questions of *separation* and *approximation*: We say that a language L ⊆ Σ^ω is a *separator* for two languages L_1, L_2 ⊆ Σ^ω if L_1 ⊆ L and L ∩ L_2 = ∅. Studies of separation include a search for regular separators of general languages [11], as well as separation of regular languages by weaker classes of languages, e.g., FO-definable languages [40] or piecewise testable languages [12]. In the context of ω-regular languages, [2] presents an algorithm computing the smallest safety language containing a given language L_1, thus finding a safety separator for L_1 and L_2. As far as we know, besides this result there has been no systematic study of separation of ω-regular languages by deterministic automata.

In addition to the interest in separators, we use them in the context of recognizability in two ways. First, a third type of certificate that we suggest for DBW-refutation of a language L are "simple" languages L_1 and L_2 such that L_1 ⊆ L, L ∩ L_2 = ∅, and L_1, L_2 are not DBW-separable. Second, we use separability in order to approximate languages that are not in DBW. Consider such a language L ⊆ Σ^ω. A user may be willing to approximate L in order to obtain DBW-recognizability. Specifically, we assume that there are languages I↓ ⊆ L and I↑ ⊆ Σ^ω \ L of words that the user is willing to under- and over-approximate L with. Thus, the user searches for a language that is a separator for L \ I↓ and Σ^ω \ (L ∪ I↑). We study DBW-separability and DBW-approximation, namely separability and approximation by languages in DBW. In particular, we are interested in finding "small" approximating languages I↓ and I↑ with which L has a DBW-approximation, and we show how certificates that refute DBW-separation can direct the search for successful I↓ and I↑. Essentially, as in *counterexample guided abstraction-refinement* (CEGAR) for model checking [10], we use certificates for non-DBW-separability in order to suggest interesting *radius languages*. While in CEGAR the refined system excludes the counterexample, in our setting the approximation of L excludes the certificate. As has been the case with recognizability, we extend our results to all classes of deterministic automata.

### **2 Preliminaries**

#### **2.1 Transducers and Realizability**

Consider two finite alphabets Σ and A. It is convenient to think about Σ as the "main" alphabet, and about A as an alphabet of annotations. For two words x = x_0 · x_1 · x_2 ··· ∈ Σ^ω and y = y_0 · y_1 · y_2 ··· ∈ A^ω, we define x ⊕ y as the word in (Σ × A)^ω obtained by merging x and y. Thus, x ⊕ y = (x_0, y_0) · (x_1, y_1) · (x_2, y_2) ··· .

A (Σ/A)*-transducer* models a finite-state system that responds with letters in A while interacting with an environment that generates letters in Σ. Formally, a (Σ/A)-transducer is T = ⟨Σ, A, ι, S, s_0, ρ, τ⟩, where ι ∈ {*sys*, *env*} indicates who initiates the interaction – the system or the environment, S is a set of states, s_0 ∈ S is an initial state, ρ : S × Σ → S is a transition function, and τ : S → A is a labelling function on the states. Consider an input word x = x_0 · x_1 · x_2 ··· ∈ Σ^ω. The *run* of T on x is the sequence s_0, s_1, s_2,... such that for all j ≥ 0, we have that s_{j+1} = ρ(s_j, x_j). The *annotation of* x *by* T, denoted T(x), depends on ι. If ι = *sys*, then T(x) = τ(s_0) · τ(s_1) · τ(s_2) ··· ∈ A^ω. Note that the first letter in A is the output of T in s_0. This reflects the fact that the system initiates the

interaction. If ι = *env*, then T(x) = τ(s_1) · τ(s_2) · τ(s_3) ··· ∈ A^ω. Note that now, the output in s_0 is ignored, reflecting the fact that the environment initiates the interaction.
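A minimal Python rendering of these definitions (our own sketch; the alphabets and the state set are left implicit in the dictionaries, and all names are illustrative). The annotate method computes T(x) on a finite prefix, following the ι = sys / ι = env convention for which output letter comes first.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Transducer:
    iota: str                            # "sys" or "env": who initiates
    s0: str                              # initial state
    rho: Dict[Tuple[str, str], str]      # transition function S x Sigma -> S
    tau: Dict[str, str]                  # output labelling S -> A

    def annotate(self, x: List[str]) -> List[str]:
        """T(x) on a finite prefix: with iota = "sys" the output starts
        with tau(s0); with iota = "env" the output of s0 is skipped."""
        states = [self.s0]
        for letter in x:
            states.append(self.rho[(states[-1], letter)])
        outs = [self.tau[s] for s in states]
        return outs[:len(x)] if self.iota == "sys" else outs[1:len(x) + 1]

# The refuter of Example 1 as an (A/Sigma)-transducer with iota = "env":
# on annotation acc it moves to a state that outputs a, on rej to one
# that outputs b.
rho = {}
for s in ("init", "out_a", "out_b"):
    rho[(s, "acc")] = "out_a"
    rho[(s, "rej")] = "out_b"
R = Transducer(iota="env", s0="init", rho=rho,
               tau={"init": "b", "out_a": "a", "out_b": "b"})
print(R.annotate(["acc", "rej", "rej"]))   # ['a', 'b', 'b']
```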

Consider a language L ⊆ (Σ × A)^ω. Let *comp*(L) denote the complement of L. Thus, *comp*(L) = (Σ × A)^ω \ L. We say that a language L ⊆ (Σ × A)^ω is (Σ/A)*-realizable by the system* if there is a (Σ/A)-transducer T with ι = *sys* such that for every word x ∈ Σ^ω, we have that x ⊕ T(x) ∈ L. Then, L is (A/Σ)*-realizable by the environment* if there is an (A/Σ)-transducer T with ι = *env* such that for every word y ∈ A^ω, we have that T(y) ⊕ y ∈ L. When the language L is regular, realizability reduces to deciding a game with a regular winning condition. Then, by determinacy of games and due to the existence of finite-memory winning strategies [9], we have the following.

**Proposition 1.** *For every* ω*-regular language* L ⊆ (Σ × A)^ω*, exactly one of the following holds.*

*1.* L *is* (Σ/A)*-realizable by the system.*

*2. comp*(L) *is* (A/Σ)*-realizable by the environment.*

#### **2.2 Automata**

A *deterministic word automaton* over a finite alphabet Σ is A = ⟨Σ, Q, q_0, δ, α⟩, where Q is a set of states, q_0 ∈ Q is an initial state, δ : Q × Σ → Q is a transition function, and α is an acceptance condition. We extend δ to words in Σ^∗ in the expected way, thus for q ∈ Q, w ∈ Σ^∗, and letter σ ∈ Σ, we have that δ(q, ε) = q and δ(q, wσ) = δ(δ(q, w), σ). A *run* of A on an infinite word σ_0, σ_1, ··· ∈ Σ^ω is the sequence of states r = q_0, q_1,..., where for every position i ≥ 0, we have that q_{i+1} = δ(q_i, σ_i). We use *inf*(r) to denote the set of states that r visits infinitely often. Thus, *inf*(r) = {q : q_i = q for infinitely many i ≥ 0}.

The acceptance condition α refers to *inf*(r) and determines whether the run r is accepting. For example, in the *Büchi* acceptance condition, we have that α ⊆ Q, and a run is accepting iff it visits states in α infinitely often; that is, α ∩ *inf*(r) ≠ ∅. Dually, in *co-Büchi*, α ⊆ Q, and a run is accepting iff it visits states in α only finitely often; that is, α ∩ *inf*(r) = ∅. The language of A, denoted L(A), is then the set of words w such that the run of A on w is accepting.

A parity condition is α : Q → {0,...,k}, for k ≥ 0, termed the *index* of α. A run r satisfies α iff the maximal color i ∈ {0,...,k} such that α^{-1}(i) ∩ *inf*(r) ≠ ∅ is odd. That is, r is accepting iff the maximal color that r visits infinitely often is odd. Then, a Rabin condition is α = {⟨G_1, B_1⟩,...,⟨G_k, B_k⟩}, with G_i, B_i ⊆ Q, for all 1 ≤ i ≤ k. A run r satisfies α iff there is 1 ≤ i ≤ k such that *inf*(r) ∩ G_i ≠ ∅ and *inf*(r) ∩ B_i = ∅. Thus, there is a pair ⟨G_i, B_i⟩ such that r visits states in G_i infinitely often and visits states in B_i only finitely often.

All the acceptance conditions above can be viewed as special cases of the *Emerson-Lei acceptance condition* (EL-condition, for short) [17], which we define below. Let M be a finite set of marks. Given an infinite sequence π = M_0 · M_1 ··· ∈ (2^M)^ω of subsets of marks, let *inf*(π) be the set of marks that appear infinitely often in sets in π. Thus, *inf*(π) = {m ∈ M : there exist infinitely many i ≥ 0 such that m ∈ M_i}. An EL-condition is a Boolean assertion over atoms in M. For simplicity, we consider assertions in positive normal form, where negation is applied only to atoms. Intuitively, marks that appear positively should repeat infinitely often and marks that appear negatively should repeat only finitely often. Formally, a deterministic EL-automaton is A = ⟨Σ, Q, q_0, δ, M, τ, θ⟩, where τ : Q → 2^M maps each state to a set of marks, and θ is an EL-condition over M. A run r of A is accepting if *inf*(τ(r)) satisfies θ.

For example, a Büchi condition α ⊆ Q can be viewed as an EL-condition with M = {acc} and τ(q) = {acc} for q ∈ α and τ(q) = ∅ for q ∉ α. Then, the assertion θ = acc is satisfied by sequences π induced by runs r with *inf*(r) ∩ α ≠ ∅. Dually, the assertion θ = ¬rej with M = {rej} is satisfied by sequences π induced by runs r with *inf*(r) ∩ α = ∅, and thus corresponds to a co-Büchi condition. In the case of a parity condition α : Q → {0,...,k}, it is not hard to see that α is equivalent to an EL-condition in which M = {0, 1,...,k}, for every state q ∈ Q, we have that τ(q) = {α(q)}, and θ expresses the parity condition. Lastly, a Rabin condition α = {⟨G_1, B_1⟩,...,⟨G_k, B_k⟩} is equivalent to an EL-condition with M = {G_1, B_1,...,G_k, B_k} and τ(q) = {m ∈ M : q ∈ m}. Note that now, the mapping τ is not to singletons, and each state is marked by all sets in α in which it is a member. Then, θ = ⋁_{1≤i≤k}(G_i ∧ ¬B_i).
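A small sketch of our own showing how the EL view uniformly evaluates these conditions: assertions in positive normal form are encoded as nested tuples (an encoding we choose purely for illustration), and acceptance is decided from the set of marks occurring infinitely often.

```python
def holds(assertion, inf_marks):
    """Evaluate an EL-condition in positive normal form against the set
    of marks that occur infinitely often.  Encoding:
      ("inf", m)        -- m must occur infinitely often
      ("fin", m)        -- m must occur only finitely often
      ("and", ...), ("or", ...) -- Boolean combinations."""
    kind = assertion[0]
    if kind == "inf":
        return assertion[1] in inf_marks
    if kind == "fin":
        return assertion[1] not in inf_marks
    if kind == "and":
        return all(holds(sub, inf_marks) for sub in assertion[1:])
    if kind == "or":
        return any(holds(sub, inf_marks) for sub in assertion[1:])
    raise ValueError(f"unknown connective: {kind}")

# Buechi: acc; co-Buechi: not rej; Rabin: OR_i (G_i and not B_i).
rabin = ("or", ("and", ("inf", "G1"), ("fin", "B1")),
               ("and", ("inf", "G2"), ("fin", "B2")))
print(holds(rabin, {"G2"}))        # True:  G2 infinitely often, B2 not
print(holds(rabin, {"G1", "B1"}))  # False: B1 spoils the only active pair
```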

We use DBW, DCW, DPW, DRW, DELW to denote deterministic Büchi, co-Büchi, parity, Rabin, and EL word automata, respectively. For parity automata, we also use DPW[0, k] and DPW[1, k], for k ≥ 0, to denote DPWs in which the colours are in {0,...,k} and {1,...,k}, respectively. For Rabin automata, we use DRW[k], for k ≥ 0, to denote DRWs that have at most k elements in α. Finally, we use DELW[θ] to denote DELWs with EL-condition θ. We sometimes use the above acronyms in order to refer to the set of languages that are recognizable by the corresponding class of automata. For example, we say that a language L is in DBW if L is *DBW-recognizable*, thus there is a DBW A such that L = L(A). Note that DBW = DPW[0, 1], DCW = DPW[1, 2], and DRW[1] = DPW[0, 2]. In fact, in terms of expressiveness, DRW[k] = DPW[0, 2k] [43,31].

Consider a directed graph G = ⟨V, E⟩. A *strongly connected set* of G (SCS) is a set C ⊆ V of vertices such that for every two vertices v, v′ ∈ C, there is a path from v to v′. An SCS C is *maximal* if it cannot be extended to a larger SCS. Formally, for every nonempty C′ ⊆ V \ C, we have that C ∪ C′ is not an SCS. The maximal strongly connected sets are also termed *strongly connected components* (SCC). An automaton A = ⟨Σ, Q, Q_0, δ, α⟩ induces a directed graph G_A = ⟨Q, E⟩ in which ⟨q, q′⟩ ∈ E iff there is a letter σ such that q′ ∈ δ(q, σ). When we talk about the SCSs and SCCs of A, we refer to those of G_A. Consider a run r of an automaton A. It is not hard to see that the set *inf*(r) is an SCS. Indeed, since every two states q and q′ in *inf*(r) are visited infinitely often, the state q′ must be reachable from q.
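For small explicit automata, the graph G_A and its SCCs are easy to compute. A short sketch of our own (a quadratic mutual-reachability test, sufficient for examples; here delta maps a pair (state, letter) to a set of successor states):

```python
def graph_of_automaton(delta):
    """Edge relation of G_A: (q, q') is an edge iff q' in delta[(q, sigma)]
    for some letter sigma."""
    edges = {}
    for (q, _sigma), targets in delta.items():
        edges.setdefault(q, set()).update(targets)
    return edges

def reachable(edges, source):
    seen, todo = {source}, [source]
    while todo:
        for r in edges.get(todo.pop(), ()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def sccs(delta):
    """Strongly connected components of G_A (with the usual convention
    that every vertex trivially reaches itself)."""
    edges = graph_of_automaton(delta)
    nodes = set(edges) | {r for ts in edges.values() for r in ts}
    reach = {q: reachable(edges, q) for q in nodes}
    comps = []
    for q in nodes:
        comp = frozenset(r for r in nodes if r in reach[q] and q in reach[r])
        if comp not in comps:
            comps.append(comp)
    return comps

delta = {("q0", "a"): {"q1"}, ("q1", "b"): {"q0"},
         ("q1", "a"): {"q2"}, ("q2", "a"): {"q2"}}
print(sccs(delta))   # two components: {q0, q1} and {q2}
```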

#### **3 Refuting DBW-Recognizability**

Let A = {acc, rej}. We use ∞acc to denote the subset {a_0 · a_1 · a_2 ··· ∈ A^ω : there are infinitely many j ≥ 0 with a_j = acc} and °acc = *comp*(∞acc) = {a_0 · a_1 · a_2 ··· ∈ A^ω : there are only finitely many j ≥ 0 with a_j = acc}.

A DBW A = ⟨Σ, Q, q_0, δ, α⟩ can be viewed as a (Σ/A)-transducer T_A = ⟨Σ, A, *sys*, Q, q_0, δ, τ⟩, where for every state q ∈ Q, we have that τ(q) = acc if q ∈ α, and τ(q) = rej otherwise. Then, for every word x ∈ Σ^ω, we have that x ∈ L(A) iff T_A(x) ∈ ∞acc.

For a language L ⊆ Σ^ω, we define the language DBW(L) ⊆ (Σ × A)^ω of words with correct annotations. Thus,

$$\text{DBW}(L) = \{ x \oplus y : x \in L \text{ iff } y \in \infty \text{ACC} \}.$$

Note that *comp*(DBW(L)) is the language

NoDBW(L) = {x ⊕ y : (x ∈ L and y ∉ ∞acc) or (x ∉ L and y ∈ ∞acc)}.

A *DBW-refuter for* L is an (A/Σ)-transducer with ι = *env* realizing NoDBW(L).

*Example 2.* For every language R ⊆ Σ^∗ of finite words, the language R^ω ⊆ Σ^ω consists of infinite concatenations of words in R. It was recently shown that R^ω need not be in DBW [30]. The language used in [30] is R = \$ + (0 · {0, 1, \$}^∗ · 1). In Figure 2 below we describe a DBW-refuter for R^ω.

**Figure 2.** A DBW-refuter for (\$ + (0 · {0, 1, \$}^∗ · 1))^ω.

Following R, Refuter starts by generating a prefix 0 · 1 and then responds to acc with 1 and to rej with \$. Accordingly, if Prover generates a rejecting run, Refuter generates a word in 0 · 1 · (1 + \$)^∗ · \$^ω, which is in R^ω. Also, if Prover generates an accepting run, Refuter generates a word in 0 · 1 · (1^+ · \$^∗)^ω, which has a single 0 and infinitely many 1's, and is therefore not in R^ω.
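For concreteness, here is a small sketch (ours, not the paper's construction) of the strategy just described, simulated on a finite prefix of Prover's annotations.

```python
# A sketch of the refuter of Figure 2: emit the prefix 0.1, then answer every
# 'acc' of Prover with 1 and every 'rej' with '$'.
def refuter_output(prover_annotations):
    out = ['0', '1']
    for a in prover_annotations:
        out.append('1' if a == 'acc' else '$')
    return ''.join(out)

# Finitely many acc's push the word towards a $-tail (inside R^omega);
# infinitely many acc's force infinitely many 1's after a single 0 (outside R^omega).
print(refuter_output(['rej', 'rej', 'acc', 'rej']))   # 01$$1$
```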

By Proposition 1, we have the following.

**Proposition 2.** *Consider a language* L ⊆ Σ^ω*. Let* A = {acc, rej}*. Exactly one of the following holds:*

*1.* L *is in DBW.*
*2. There is a DBW-refuter for* L*.*

#### **3.1 Complexity**

In this section we analyze the size of refuters. We start with the case where the language L is given by a DPW.

**Theorem 1.** *Consider a DPW* A *with* n *states. Let* L = L(A)*. One of the following holds.*

*1.* L *is in DBW, and there is a DBW for* L *with* n *states.*
*2.* L *is not in DBW, and there is a DBW-refuter for* L *with at most* 2n *states.*

*Proof.* If L is in DBW, then, as DPWs are Büchi type [19], a DBW for L can be defined on top of the structure of A, and so it has n states. If L is not in DBW, then by Proposition 2, there is a DBW-refuter for L, namely a ({acc, rej}/Σ)-transducer that realizes NoDBW(L). We show that we can define a DRW U with 2n states for NoDBW(L). The result then follows from the fact that a realizable DRW is realized by a transducer of the same size as the DRW [15].

We construct U by taking the union of the acceptance conditions of a DRW U1 for {x ⊕ y : x ∈ L and y ∉ ∞acc} and a DRW U2 for {x ⊕ y : x ∉ L and y ∈ ∞acc}. We obtain both DRWs by taking the product of A, extended to the alphabet Σ × {acc, rej}, with a 2-state automaton for ∞acc, again extended to the alphabet Σ × {acc, rej}.

We describe the construction in detail. Let A = ⟨Σ, Q, q0, δ, α⟩. Then, the state space of U1 is Q × {acc, rej} and its transition on a letter ⟨σ, a⟩ follows δ when it reads σ, with a determining whether U1 moves to the acc or rej copy. Let α1 be the Rabin condition equivalent to α. We obtain the acceptance condition of U1 by replacing each pair ⟨G, B⟩ in α1 by ⟨G × {rej}, B × {rej} ∪ Q × {acc}⟩. It is not hard to see that a run of U1 satisfies the latter pair iff its projection on Q satisfies the pair ⟨G, B⟩ and its projection on {acc, rej} has only finitely many acc. The construction of U2 is similar, with α2 being a Rabin condition that complements α, and then replacing each pair ⟨G, B⟩ in α2 by ⟨G × {acc}, B × {acc, rej}⟩. Since U1 and U2 have the same state space, and we only have to take the union of the pairs in their acceptance conditions, the 2n bound follows.
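The product underlying U1 is purely mechanical; the following sketch (ours, with the automaton encoded as hypothetical Python dictionaries and sets) builds its state space, transitions, and Rabin pairs as in the proof. U2 is obtained analogously.

```python
from itertools import product

def build_U1(Q, Sigma, delta, alpha1_pairs):
    """Sketch of U1: states Q x {acc, rej}; on <sigma, a> follow delta on sigma and
    move to the a-copy; each Rabin pair <G, B> of alpha1 becomes
    <G x {rej}, (B x {rej}) union (Q x {acc})>."""
    states = set(product(Q, ['acc', 'rej']))
    trans = {((q, b), (sigma, a)): (delta[(q, sigma)], a)
             for (q, b) in states for sigma in Sigma for a in ['acc', 'rej']}
    pairs = [({(q, 'rej') for q in G},
              {(q, 'rej') for q in B} | {(q, 'acc') for q in Q})
             for (G, B) in alpha1_pairs]
    return states, trans, pairs
```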

Now, when L is given by an NBW, an exponential bound follows from the exponential blow up in determinization [42]. If we are also given an NBW for *comp*(L), the complexity can be tightened. Formally, we have the following.

**Theorem 2.** *Given NBWs with* n *and* m *states, for* L *and comp*(L)*, respectively, one of the following holds.*


*Proof.* If L is in DBW, then a DBW for L can be defined on top of a DPW for L, which has at most (1.65n)^n states [45], or by dualizing a DCW for *comp*(L). Since the translation of an NBW with m states to a DCW, when it exists, results in a DCW with 3^m states [7], we are done. If L is not in DBW, then we proceed as in the proof of Theorem 1, defining U on top of a DPW for either L or *comp*(L).

#### **3.2 Certifying DBW-Refutation**

Consider a DBW-refuter R = ⟨{acc, rej}, Σ, *env*, S, s0, ρ, τ⟩. We say that a path s0,...,sm in R is a rej^+-path if it contains at least one transition and all the transitions along it are labeled by rej; thus, for all 0 ≤ j < m, we have that s_{j+1} = ρ(s_j, rej). Then, a path s0,...,sm in R is an acc-path if it contains at least one transition and its first transition is labeled by acc. Thus, s1 = ρ(s0, acc).

**Lemma 1.** *Consider a DBW-refuter* R = ⟨{acc, rej}, Σ, *env*, S, s0, ρ, τ⟩*. Then there exists a state* s ∈ S*, a (possibly empty) path* p = s0, s1, ..., sm*, a* rej^+*-cycle* p^1 = s^1_0, s^1_1, ..., s^1_{m1}*, and an* acc*-cycle* p^2 = s^2_0, s^2_1, ..., s^2_{m2}*, such that* sm = s^1_0 = s^1_{m1} = s^2_0 = s^2_{m2} = s*.*

*Proof.* Let si ∈ S be a reachable state that belongs to an ergodic component in the graph of R (that is, si ∈ C, for a set C of strongly connected states that can reach only states in C). Since R is responsive, in the sense that it can read in each round both acc and rej, we can read from si the input sequence rej^ω. Hence, R has a rej^+-path si, ..., sl, ..., sk with sl = sk, for l < k. It is easy to see that the claim holds with s = sl. In particular, since R is responsive and C is strongly connected, there exists an acc-cycle from sl to itself.

**Theorem 3.** *An* ω*-regular language* L *is not in DBW iff there exist three finite words* x ∈ Σ^∗ *and* x1, x2 ∈ Σ^+*, such that* x · (x1 + x2)^∗ · x1^ω ⊆ L *and* x · (x1^∗ · x2)^ω ∩ L = ∅*.*

*Proof.* Assume first that L is not in DBW. Then, by Theorem 2, there exists a DBW-refuter R for it. Let p = s0, s1, ..., sm, p^1 = s^1_0, s^1_1, ..., s^1_{m1}, and p^2 = s^2_0, s^2_1, ..., s^2_{m2} be the path, rej^+-cycle, and acc-cycle that are guaranteed to exist by Lemma 1. Let x, x1, and x2 be the outputs that R generates along them. Formally, x = τ(s1) · τ(s2) ··· τ(sm), x1 = τ(s^1_1) · τ(s^1_2) ··· τ(s^1_{m1}), and x2 = τ(s^2_1) · τ(s^2_2) ··· τ(s^2_{m2}). Note that as the environment initiates the interaction, the first letters in the words x, x1, and x2 are the outputs in the second states of p, p^1, and p^2. The final step, i.e., that x, x1, and x2 satisfy the two conditions of the theorem, can be found in the full version of this article [27].

For the other direction, we adjust Landweber's proof [29] for the non-DBW-recognizability of °a to L. Essentially, °a can be viewed as a special case of x · (x1 + x2)^∗ · x1^ω, with x = ε, x1 = b, and x2 = a. Assume by way of contradiction that there is a DBW A with L(A) = L. Let A = ⟨Σ, Q, q0, δ, α⟩. Consider the infinite word w0 = x · x1^ω. Since w0 ∈ x · (x1 + x2)^∗ · x1^ω, and so w0 ∈ L, the run of A on w0 is accepting. Thus, there is i1 ≥ 0 such that A visits α when it reads the x1-suffix of x · x1^{i1}. Consider now the infinite word w1 = x · x1^{i1} · x2 · x1^ω. Since w1 is also in L, the run of A on w1 is accepting. Thus, there is i2 ≥ 0 such that A visits α when it reads the x1-suffix of x · x1^{i1} · x2 · x1^{i2}. In a similar fashion we can continue to find indices i1, i2, ... such that for all j ≥ 1, we have that A visits α when it reads the x1-suffix of x · x1^{i1} · x2 · x1^{i2} · x2 ··· x2 · x1^{ij}. Since Q is finite, we can construct a word w ∈ x · (x1^∗ · x2)^ω that is accepted by A. But we assumed that x · (x1^∗ · x2)^ω ∩ L = ∅, and thus we have reached a contradiction. The details of this step are given in [27].

We refer to a triple ⟨x, x1, x2⟩ of words that satisfies the conditions in Theorem 3 as a *certificate* to the non-DBW-recognizability of L.

*Example 3.* In Example 2, we described a DBW-refuter for L = (\$ + (0 · {0, 1, \$}^∗ · 1))^ω. A certificate to its non-DBW-recognizability is ⟨x, x1, x2⟩, with x = 01, x1 = \$, and x2 = 1. Indeed, 01 · (\$ + 1)^∗ · \$^ω ⊆ L and 01 · (\$^∗ · 1)^ω ∩ L = ∅.

Note that obtaining certificates according to the proof of Theorem 3 may not give us the shortest certificate. For example, for L in Example 3, the proof would give us x = 01\$, x1 = \$, and x2 = 1\$, with 01\$ · (\$ + 1\$)^∗ · \$^ω ⊆ L and 01\$ · (\$^∗ · 1\$)^ω ∩ L = ∅. The problem of generating smallest certificates is related to the problem of finding smallest witnesses to DBW non-emptiness [22] and is harder. Formally, defining the length of a certificate ⟨x, x1, x2⟩ as |x| + |x1| + |x2|, we have the following (see proof in [27]):

**Theorem 4.** *Consider a DPW* A *and a threshold* l ≥ 1*. The problem of deciding whether there is a certificate of length at most* l *for non-DBW-recognizability of* L(A) *is NP-complete, for* l *given in unary or binary.*

*Remark 1.* **[Relation with existing characterizations]** By [29], the language of a DPW A = ⟨Σ, Q, q0, δ, α⟩ is in DBW iff for every accepting SCS C ⊆ Q and SCS C′ ⊇ C, we have that C′ is accepting. The proof of Landweber relies on a complicated analysis of the structural properties of A. As we elaborate in the full version [27], Theorem 3, which relies instead on determinacy of games, suggests an alternative proof. Similarly, [50] examines the structure of a deterministic Muller automaton, and Theorem 3 can be viewed as a special case of Lemma 14 there, with a proof based on the game setting.

Being an (A/Σ)-transducer, every DBW-refuter R is responsive and may generate many different words in Σ^ω. Below we show that we can leave R responsive and yet let it generate only words induced by a certificate. Formally, we have the following.

**Lemma 2.** *Given a certificate* ⟨x, x1, x2⟩ *to non-DBW-recognizability of a language* L ⊆ Σ^ω*, we can define a refuter* R *for* L *such that for every* y ∈ A^ω*, if* y |= ∞acc*, then* R(y) ∈ x · (x1^∗ · x2)^ω*, and if* y |= °acc*, then* R(y) ∈ x · (x1 + x2)^∗ · x1^ω*.*

*Proof.* Intuitively, R first ignores the inputs and outputs x. It then repeatedly outputs either x1 or x2, according to the following policy: in the first iteration, R outputs x1. If during the output of x1 all inputs are rej, then R outputs x1 also in the next iteration. If an input acc has been detected, thus the prover tries to accept the constructed word, the refuter outputs x2 in the next iteration, again keeping track of an acc input. If no acc has been input, R switches back to outputting x1. The formal definition of R can be found in [27].
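The policy in the proof is easy to simulate; the sketch below (ours, simulating only a finite prefix and assuming Prover's annotations are given as a list) follows it literally, switching between x1 and x2 depending on whether an acc was read during the previous block.

```python
def certificate_refuter(x, x1, x2, prover_annotations):
    """Finite-prefix sketch of the refuter of Lemma 2."""
    answers = iter(prover_annotations)
    out = list(x)
    for _ in x:                         # inputs read while emitting x are ignored
        next(answers, 'rej')
    block = x1                          # the first iteration outputs x1
    while True:
        acc_seen = False
        for letter in block:
            out.append(letter)
            a = next(answers, None)
            if a is None:               # prover prefix exhausted: stop the simulation
                return ''.join(out)
            if a == 'acc':
                acc_seen = True
        block = x2 if acc_seen else x1  # switch to x2 after an acc, back to x1 otherwise

# Certificate of Example 3: x = 01, x1 = $, x2 = 1.
print(certificate_refuter('01', '$', '1', ['rej'] * 4 + ['acc'] + ['rej'] * 3))
```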

By Theorem 3, every language not in DBW has a certificate ⟨x, x1, x2⟩. As we argue below, these certificates are linear in the number of states of the refuters.

**Lemma 3.** *Let* R *be a DBW-refuter for* L ⊆ Σ^ω *with* n *states. Then,* L *has a certificate of the form* ⟨x, x1, x2⟩ *such that* |x| + |x1| + |x2| ≤ 2 · n*.*

*Proof.* The paths p, p^1, and p^2 that induce x, x1, and x2 in the proof of Theorem 3 are simple, and so they are all of length at most n. Also, while these paths may share edges, we can define them so that each edge appears in at most two paths. Indeed, if an edge appears in all three paths, we can shorten p. Hence, |x| + |x1| + |x2| ≤ 2 · n, and we are done.

**Theorem 5.** *Consider a language* L ⊆ Σ^ω *not in DBW. The length of a certificate for the non-DBW-recognizability of* L *is linear in a DPW for* L *and is exponential in an NBW for* L*. These bounds are tight.*

*Proof.* The upper bounds follow from Theorem 1 and Lemma 3, and the exponential determinization of NBWs. The lower bound in the NBW case follows from the exponential lower bound on the size of shortest non-universality witnesses for nondeterministic finite word automata (NFW) [34]. We sketch the reduction: Let Ln ⊆ {0, 1}^∗ be a language such that the shortest witness for non-universality of Ln is exponential in n, but Ln has a polynomial-sized NFW. We then define L′n = (Ln · \$ · (0^∗ · 1)^ω) + ((0 + 1)^∗ · \$ · (0 + 1)^∗ · 0^ω). It is clear that L′n has an NBW polynomial in n and is not DBW-recognizable. Note that for every word w ∈ Ln, we have w · \$ · (0 + 1)^ω ⊆ L′n. Thus, in order to satisfy Theorem 3, every certificate ⟨x, x1, x2⟩ needs to have w · \$ as a prefix of x, for some w ∉ Ln. Hence, it is exponential in the size of the NBW.

*Remark 2.* **[LTL]** When the language L is given by an LTL formula ϕ, then DBW(ϕ) = ϕ ↔ **GF**acc, and thus an off-the-shelf LTL synthesis tool can be used to extract a DBW-refuter, if one exists. As for complexity, a doubly-exponential upper bound on the size of a DPW for NoDBW(L), and then also on the size of DBW-refuters and certificates, follows from the doubly-exponential translation of LTL formulas to DPWs [49,42]. The length of certificates, however, and then, by Lemma 2, also the size of a minimal refuter, is related to the *diameter* of the DPW for NoDBW(L), and we leave its tight bound open.

### **4 Separability and Approximations**

Consider three languages L1, L2, L ⊆ Σ^ω. We say that L is a *separator* for ⟨L1, L2⟩ if L1 ⊆ L and L2 ∩ L = ∅. We say that a pair of languages ⟨L1, L2⟩ is *DBW-separable* iff there exists a language L in DBW such that L is a separator for ⟨L1, L2⟩.

*Example 4.* Let Σ = {a, b}, L1 = (a + b)^∗ · b^ω, and L2 = (a + b)^∗ · a^ω. By [29], L1 and L2 are not in DBW. They are, however, DBW-separable. A witness for this is L = (a^∗ · b)^ω. Indeed, L1 ⊆ L, L ∩ L2 = ∅, and L is DBW-recognizable.

Consider a language L ⊆ Σ^ω, and suppose we know that L is not in DBW. A user may be willing to approximate L in order to obtain DBW-recognizability. Specifically, we assume that there is a language I ⊆ Σ^ω of words that the user is *indifferent* about. Thus, the user is satisfied with a language in DBW that agrees with L on all words that are not in I. Formally, we say that a language L′ *approximates* L *with radius* I if L \ I ⊆ L′ ⊆ L ∪ I. It is easy to see that, equivalently, L′ is a separator for ⟨L \ I, *comp*(L ∪ I)⟩. Note that the above formulation embodies the case where the user has in mind different over- and under-approximation radii, thus separating ⟨L \ I↓, *comp*(L ∪ I↑)⟩ for possibly different I↓ and I↑. Indeed, by defining I = (I↓ ∩ L) ∪ (I↑ \ L), we get ⟨L \ I, *comp*(L ∪ I)⟩ = ⟨L \ I↓, *comp*(L) \ I↑⟩.

It follows that by studying DBW-separability, we also study DBW-approximation, namely approximation by a language that is in DBW, possibly with different over- and under-approximation radii.

*Remark 3.* **[From recognizability to separation]** It is easy to see that DBW-separability generalizes DBW-recognizability, as L is in DBW iff ⟨L, *comp*(L)⟩ is DBW-separable. Given L ⊆ Σ^ω, we say that a pair of languages ⟨L1, L2⟩ is a *no-DBW-witness* for L if L is a separator for ⟨L1, L2⟩ and ⟨L1, L2⟩ is not DBW-separable. Note that the latter indeed implies that L is not in DBW.

A simple no-DBW-witness for L can be obtained as follows. Let R be a DBW-refuter for L. Then, we define L1 = {R(y) : y ∈ ¬∞acc} and L2 = {R(y) : y ∈ ∞acc}. By the definition of DBW-refuters, we have L1 ⊆ L and L2 ∩ L = ∅, and so ⟨L1, L2⟩ is a no-DBW-witness for L. It is simple, in the sense that when we describe L1 and L2 by a tree obtained by pruning the Σ^∗-tree, then each node has at most two children – those that correspond to the responses of R to acc and rej.

#### **4.1 Refuting Separability**

For a pair of languages ⟨L1, L2⟩, we define the language SepDBW(L1, L2) ⊆ (Σ × A)^ω of words with correct annotations for separation. Thus,

SepDBW(L1, L2) = {x ⊕ y : (x ∈ L1 → y ∈ ∞acc) ∧ (x ∈ L2 → y ∉ ∞acc)}.

Note that *comp*(SepDBW(L1, L2)) is then the language

NoSepDBW(L1, L2) = {x ⊕ y : (x ∈ L1 ∧ y ∉ ∞acc) ∨ (x ∈ L2 ∧ y ∈ ∞acc)}.

A *DBW-sep-refuter for* ⟨L1, L2⟩ is an (A/Σ)-transducer with ι = *env* that realizes NoSepDBW(L1, L2).

*Example 5.* Consider the language L°a = (a + b)^∗ · b^ω, which is not in DBW. Let I = a^∗ · b^ω + b^∗ · a^ω, thus we are indifferent about words with at most one alternation between a and b. In Figure 3 we describe a DBW-sep-refuter for ⟨L°a \ I, *comp*(L°a ∪ I)⟩. Note that the refuter generates only words in a · b · a · (a + b)^ω, whose intersection with I is empty. Consequently, the refutation is similar to the DBW-refutation of L°a.

**Figure 3.** A DBW-sep-refuter for ⟨L°a \ I, comp(L°a ∪ I)⟩.

By Proposition 1, we have the following extension of Proposition 2.

**Proposition 3.** *Consider two languages* L1, L2 ⊆ Σ^ω*. Let* A = {acc, rej}*. Exactly one of the following holds:*

*1.* ⟨L1, L2⟩ *is DBW-separable.*
*2. There is a DBW-sep-refuter for* ⟨L1, L2⟩*.*

As for complexity, the construction of the game for SepDBW(L1, L2) is similar to the one described in Theorem 1. Here, however, the input to the problem includes two DPWs. Also, the positive case, namely the construction of the separator, does not follow from known results.

**Theorem 6.** *Consider DPWs* A1 *and* A2 *with* n1 *and* n2 *states, respectively. Let* L1 = L(A1) *and* L2 = L(A2)*. One of the following holds.*

*1.* ⟨L1, L2⟩ *is DBW-separable, and there is a DBW separator for* ⟨L1, L2⟩ *with at most* 2 · n1 · n2 *states.*
*2.* ⟨L1, L2⟩ *is not DBW-separable, and there is a DBW-sep-refuter for* ⟨L1, L2⟩ *with at most* 2 · n1 · n2 *states.*

*Proof.* We show that SepDBW(L1, L2) and NoSepDBW(L1, L2) can be recognized by DRWs with at most 2 · n1 · n2 states. Then, by [15], we can construct a DBW or a DBW-sep-refuter with at most 2 · n1 · n2 states. The construction is similar to the one described in the proof of Theorem 1. The only technical challenge is the fact that SepDBW(L1, L2) is defined as the intersection, rather than union, of two languages. For this, we observe that we can define SepDBW(L1, L2) also as {x ⊕ y : (y ∈ ∞acc and x ∉ L2) or (y ∉ ∞acc and x ∉ L1)}. With this formulation we can then reuse the union construction from the proof of Theorem 1 to obtain DRWs with at most 2 · n1 · n2 states.

As has been the case with DBW-recognizability, one can generate certificates from a DBW-sep-refuter. The proof is similar to that of Theorem 3, with membership in L1 replacing membership in L, and membership in L2 replacing disjointness from L. Formally, we have the following.

**Theorem 7.** *Two* ω*-regular languages* L1, L2 ⊆ Σ^ω *are not DBW-separable iff there exist three finite words* x ∈ Σ^∗ *and* x1, x2 ∈ Σ^+*, such that* x · (x1 + x2)^∗ · x1^ω ⊆ L1 *and* x · (x1^∗ · x2)^ω ⊆ L2*.*

We refer to a triple ⟨x, x1, x2⟩ of words that satisfies the conditions in Theorem 7 as a *certificate* to the non-DBW-separability of ⟨L1, L2⟩. Observe that in the same way we generated a no-DBW-witness in Remark 3, we can extract, given a DBW-sep-refuter R for ⟨L1, L2⟩, languages L′1 ⊆ L1 and L′2 ⊆ L2 that tighten ⟨L1, L2⟩ and are still not DBW-separable.

#### **4.2 Certificate-Guided Approximation**

In this section we describe a method for finding small approximating languages I↓ and I↑ such that ⟨L \ I↓, *comp*(L) \ I↑⟩ is DBW-separable. If this method terminates, we obtain an approximation for L that is DBW-recognizable. As in *counterexample-guided abstraction refinement* (CEGAR) for model checking [10], we use certificates for non-DBW-separability in order to suggest interesting approximating languages. Intuitively, while in CEGAR the refined system excludes the counterexample, here the approximation of L excludes the certificate.

Consider a certificate ⟨x, x1, x2⟩ for the non-DBW-separability of ⟨L1, L2⟩. We suggest the following five approximations:

$$\begin{array}{lll}
C_0 = x \cdot (x_1 + x_2)^\omega & \leadsto & \langle L_1 \setminus C_0,\ L_2 \setminus C_0 \rangle \\
C_1 = x \cdot (x_1 + x_2)^\ast \cdot x_1^\omega = L_1 \cap C_0 & \leadsto & \langle L_1 \setminus C_1,\ L_2 \rangle \\
C_2 = x \cdot (x_2^\ast \cdot x_1)^\omega \supset C_1 & \leadsto & \langle L_1,\ L_2 \setminus C_2 \rangle \\
C_3 = x \cdot (x_1^\ast \cdot x_2)^\omega = L_2 \cap C_0 & \leadsto & \langle L_1,\ L_2 \setminus C_3 \rangle \\
C_4 = x \cdot (x_1 + x_2)^\ast \cdot x_2^\omega \subset C_3 & \leadsto & \langle L_1,\ L_2 \setminus C_4 \rangle
\end{array}$$

First, it is easy to verify that ⟨x, x1, x2⟩ is indeed not a certificate for the non-DBW-separability of the obtained candidate pairs ⟨L′1, L′2⟩. If ⟨L′1, L′2⟩ is DBW-separable, we are done (yet may try to tighten the approximation). Otherwise, we can repeat the process with a certificate for the non-DBW-separability of ⟨L′1, L′2⟩. As in CEGAR, some suggestions may be more interesting than others, in some cases the process terminates, in some it does not, and the user takes part in directing the search.

*Example 6.* Consider again the language L = (a + b)^∗ · b^ω and the certificate ⟨x, x1, x2⟩ = ⟨ε, b, a⟩. Trying to approximate L by a language in DBW, we start with the pair ⟨L, *comp*(L)⟩. Our five suggestions are then as follows.

$$\begin{array}{lll}
C_0 = \Sigma^{\omega} & \leadsto & \langle L \setminus C_0,\ \mathit{comp}(L) \setminus C_0 \rangle = \langle \emptyset, \emptyset \rangle \\
C_1 = (b+a)^\ast \cdot b^{\omega} & \leadsto & \langle L \setminus C_1,\ \mathit{comp}(L) \rangle = \langle \emptyset, \mathit{comp}(L) \rangle \\
C_2 = (a^\ast \cdot b)^{\omega} & \leadsto & \langle L,\ \mathit{comp}(L) \setminus C_2 \rangle = \langle L, (a+b)^\ast \cdot a^{\omega} \rangle \\
C_3 = (b^\ast \cdot a)^{\omega} & \leadsto & \langle L,\ \mathit{comp}(L) \setminus C_3 \rangle = \langle L, \emptyset \rangle \\
C_4 = (b+a)^\ast \cdot a^{\omega} & \leadsto & \langle L,\ \mathit{comp}(L) \setminus C_4 \rangle = \langle L, (a+b)^\ast \cdot (a \cdot a^\ast \cdot b \cdot b^\ast)^{\omega} \rangle
\end{array}$$

Candidates C0, C1, and C3 induce trivial approximations. Then, C2 suggests to over-approximate L by setting I↑ to (a^∗ · b)^ω, which we view as a nice solution, approximating "eventually always b" by "infinitely often b". Finally, the pair derived from C4 is not DBW-separable. We can try to approximate it. Note, however, that repeated approximations in the spirit of C4 are only going to extend the prefix x in the certificates, and the process does not terminate. In the full version of this article [27], we describe the process for the certificate ⟨x, x1, x2⟩ = ⟨a, b, a⟩, which again might not terminate.

### **5 Other Classes of Deterministic Automata**

In this section we generalize the idea of DBW-refuters to other classes of deterministic automata. For this we take again the view that a deterministic automaton is a (Σ/A)-transducer over a suitable annotation alphabet A. We then characterize each class of deterministic automata by two languages over A:

– a language Lacc ⊆ A^ω, characterizing the annotations of accepting runs, and
– a language Lstruct ⊆ A^ω, characterizing a structural restriction that the annotations of all runs must satisfy.

We now formalize this intuition. Let A be a finite set of annotations and let γ = ⟨Lacc, Lstruct⟩, for Lacc, Lstruct ⊆ A^ω. A deterministic automaton A = ⟨Σ, Q, q0, δ, α⟩ is a deterministic γ-automaton (DγW, for short) if there is a function τ : Q → A that maps each state to an annotation such that a run r of A satisfies α iff τ(r) ∈ Lacc, and all runs r satisfy the structural condition, thus τ(r) ∈ Lstruct. We then say that a language L is γ-recognizable if there is a DγW A such that L = L(A).

Before we continue to study γ-recognizability, let us demonstrate the γ-characterization of common deterministic automata. We start with classes γ for which Lstruct is trivial; i.e., Lstruct = A^ω.


Note that the characterizations for Büchi, co-Büchi, and parity are special cases of the characterization for DELW. In a similar way, we could define a language Lacc for DRW[k] and other common special cases of DELWs. We continue to classes in the depth hierarchy, where γ includes also a structural restriction:


Let Σ be an alphabet, let A be an annotation alphabet, and let γ = ⟨Lacc, Lstruct⟩, for Lacc, Lstruct ⊆ A^ω. We define the language Real(L, γ) ⊆ (Σ × A)^ω of words with correct annotations.

Real(L, γ) = {x ⊕ y : y ∈ Lstruct and (x ∈ L iff y ∈ Lacc)}.

Note that the language DBW(L) can be viewed as a special case of our general framework. In particular, when Lstruct = A^ω, we can remove the y ∈ Lstruct conjunct from Real(L, γ). Note that *comp*(Real(L, γ)) is the language

NoReal(L, γ) = {x ⊕ y : y ∉ Lstruct or (x ∈ L iff y ∉ Lacc)}.

A γ*-refuter for* L is then an (A/Σ)-transducer with ι = *env* that realizes NoReal(L, γ). We can now state the "DγW-generalization" of Proposition 2.

**Proposition 4.** *Consider an* ω*-regular language* L ⊆ Σ^ω*, and a pair* γ = ⟨Lacc, Lstruct⟩*, for* ω*-regular languages* Lacc, Lstruct ⊆ A^ω*. Exactly one of the following holds:*

*1.* L *is* γ*-recognizable.*
*2. There is a* γ*-refuter for* L*.*

Note that every DELW can be complemented by dualization, thus by changing its acceptance condition from θ to ¬θ. In particular, DBW and DCW dualize each other. As we argue below, dualization is carried over to refutation. For example, the ({acc, rej}/Σ)-transducer R from Figure 1 is both a DBW-refuter for °a and a DCW-refuter for ∞a. Formally, we have the following.

**Theorem 8.** *Consider an EL-condition* θ *over* M*. Let* A = 2^M*. For every* (A/Σ)*-transducer* R *and language* L*, we have that* R *is a* DELW[θ]*-refuter for* L *iff* R *is a* DELW[¬θ]*-refuter for comp*(L)*. In particular, for every language* L *and* ({acc, rej}/Σ)*-transducer* R*, we have that* R *is a DBW-refuter for* L *iff* R *is a DCW-refuter for comp*(L)*.*

*Proof.* For DELW[θ]-recognizability of L, the language of correct annotations is {x⊕y : (x ∈ L iff y |= θ)}, which is equal to {x⊕y : (x ∈ *comp*(L) iff y |= ¬θ)}, which is the language of correct annotations for DELW[¬θ]-recognizability of *comp*(L).

While dualization is nicely carried over to refutation, this is not the case for all expressiveness results. For example, while DWW = DBW ∩ DCW, and in fact DBW and DCW are weak type (that is, when the language of a DBW is in DWW, an equivalent DWW can be defined on top of its structure, and similarly for DCW [21]), we describe in [27] a DWW-refuter that is neither a DBW- nor a DCW-refuter. Intuitively, this is possible as in DWW refutation, Prover loses when the input is not in A^∗ · (acc^ω + rej^ω), whereas in DBW and DCW refutation, Refuter has to respond correctly also for these inputs.

On the other hand, as every DWW is also a DBW and a DCW, every DBW-refuter or DCW-refuter is also a DWW-refuter.

It is easy to see that our results about DγW-recognizability can be extended to separability and approximation in the same way DBW-recognizability has been extended in Section 4. We describe the details in the full version [27], as well as word-certificates for the non-DγW-recognizability and -separability of several well-known types of γ.

#### **6 Discussion and Directions for Future Research**

The automation of decision procedures makes certification essential. We suggest to use the winning strategy of the refuter in expressiveness games as a certificate to inexpressibility. We show that beyond this *state-based certificate*, the strategy induces a *word-based certificate*, generated from words traversed along a "flower structure" the strategy contains, as well as a *language-based certificate*, consisting of languages that under- and over-approximate the language in question and that are not separable by automata in the desired class.

While our work considers *expressive power*, one can use similar ideas in order to question the *size* of automata needed to recognize a given language. For example, in the case of a regular language L of finite words, the Myhill-Nerode characterization [37,38] suggests refuting the existence of deterministic finite word automata (DFW) with n states for L by providing n + 1 prefixes that are not right-congruent. Using our approach, one can alternatively consider the winning strategy of Refuter in a game in which the set of annotations includes also the state space, and Lstruct ensures consistency of the transition relation. Even more interesting is refutation of size in the setting of automata on infinite words. Indeed, there, minimization is NP-complete [46], and there are interesting connections between polynomial certificates and possible membership in co-NP, as well as connections between the size of certificates and the succinctness of the different classes of automata.

Finally, while the approximation scheme we studied is based on suggested over- and under-approximating languages, it is interesting to study approximations that are based on more flexible distance measures [13,18].

### **References**


for Verification and Analysis. Lecture Notes in Computer Science, vol. 10482, pp. 67–83. Springer (2017)


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A General Semantic Construction of Dependent Refinement Type Systems, Categorically**

Satoshi Kura1,2

<sup>1</sup> National Institute of Informatics, Tokyo, Japan <sup>2</sup> The Graduate University for Advanced Studies (SOKENDAI), Kanagawa, Japan kura@nii.ac.jp

**Abstract.** Dependent refinement types are types equipped with predicates that specify preconditions and postconditions of underlying functional languages. We propose a general semantic construction of dependent refinement type systems from underlying type systems and predicate logic, that is, a construction of liftings of closed comprehension categories from given (underlying) closed comprehension categories and posetal fibrations for predicate logic. We give sufficient conditions to lift structures such as dependent products, dependent sums, computational effects, and recursion from the underlying type systems to dependent refinement type systems. We demonstrate the usage of our construction by giving semantics to a dependent refinement type system and proving soundness.

#### **1 Introduction**

Dependent refinement types [6] are types equipped with predicates that restrict values in the types. They are used to specify preconditions and postconditions which may depend on input values and to verify that programs satisfy the specifications. Many dependent refinement type systems have been proposed [5,6,13,14,25] and implemented in, e.g., F* [23,24] and LiquidHaskell [19,26,27].

In this paper, we address the question: "How are dependent refinement type systems, underlying type systems, and predicate logic related from the viewpoint of categorical semantics?" Although most existing dependent refinement type systems are proved to be sound using operational semantics, we believe that categorical semantics is more suitable for the general understanding of their nature, especially when we consider general computational effects and various kinds of predicate logic (e.g., for relational verification). This understanding will provide guidelines to design new dependent refinement type systems.

Our answer to the question is a general semantic construction of dependent refinement type systems from underlying type systems and predicate logic. More concretely, given a closed comprehension category (CCompC for short) for interpreting an underlying type system and a fibration for predicate logic, we combine them to obtain another CCompC that can interpret a dependent refinement type system built from the underlying type system and the predicate logic.


For example, consider giving an interpretation to the term "x : {int | x ≥ 0} ⊢ x + 1 : {v : int | v = x + 1}" in a dependent refinement type system. Its underlying term is "x : int ⊢ x + 1 : int," and we assume that it is interpreted as the successor function on Z in **Set**. The problem here is how to refine this interpretation with predicates. In dependent refinement types, predicates may depend on the variables in contexts. In this example, the type "x : {int | x ≥ 0} ⊢ {v : int | v = x + 1}" depends on the variable x. Thus, the interpretation of such types must be a predicate on the context and the type, i.e.,

$$[\![x : \{\mathit{int} \mid x \ge 0\} \vdash \{v : \mathit{int} \mid v = x + 1\}]\!] = \{ (x, v) \in \mathbb{Z} \times \mathbb{Z} \mid x \ge 0 \land v = x + 1\}.$$

As a result, the term in the dependent refinement type system is interpreted as the interpretation in the underlying type system together with the property that if the input satisfies preconditions, then the output satisfies postconditions.

$$\begin{array}{ccc}
\{x \in \mathbb{Z} \mid x \ge 0\} & \dashrightarrow & \{ (x, v) \in \mathbb{Z} \times \mathbb{Z} \mid x \ge 0 \land v = x + 1 \} \\
\downarrow & & \downarrow \\
\mathbb{Z} & \xrightarrow{\ \langle \mathrm{id}_{\mathbb{Z}},\, (-) + 1 \rangle\ } & \mathbb{Z} \times \mathbb{Z}
\end{array} \tag{1}$$
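Concretely, diagram (1) amounts to a non-categorical sanity check: the underlying function must map inputs satisfying the precondition to pairs satisfying the postcondition. The following sketch (ours, purely illustrative) tests this on a finite range of sample inputs.

```python
# Illustration only: the underlying interpretation of "x + 1" respects the
# refinement x : {int | x >= 0} |- {v : int | v = x + 1}.
def underlying(x):                            # interpretation of x : int |- x + 1 : int
    return x + 1

pre  = lambda x: x >= 0                       # predicate on the context
post = lambda x, v: x >= 0 and v == x + 1     # predicate on context and type

assert all(post(x, underlying(x)) for x in range(100) if pre(x))
```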

We formalize this refinement process as a construction of liftings of CCompCs, which are used to interpret dependent type theories. Assume that we have a pair of a CCompC p : E → B for interpreting underlying type systems and a fibration q : P → B for predicate logic satisfying certain conditions. Then we construct a CCompC {E | P} → P for interpreting dependent refinement type systems. This construction also yields a morphism of CCompCs from {E | P} → P to p : E → B in Fig. 1. Given the simple fibration **s**(**Set**) → **Set** for underlying type systems and the subobject fibration **Sub**(**Set**) → **Set** for predicate logic, we get interpretations like (1).

We extend the construction of liftings of CCompCs to liftings of fibred monads [1] on CCompCs, which is motivated by the fact that many dependent refinement type systems have computational effects, e.g., exceptions (like division and assertion), divergence, nondeterminism [25], and probability [5]. Assume that we have a fibred monad Tˆ on p : E → B, a monad T on B, and a lifting T˙ of T along q : P → B. Under a certain condition that roughly claims that Tˆ and T represent the same computational effects, we construct a fibred monad on {E | P} → P, which is a lifting of Tˆ in the same spirit as the given lifting T˙. This situation is rather realistic because the fibred monad Tˆ on the CCompC p : E → B is often induced from the monad T on the base category B. The lifting T˙ of the monad T along q : P → B specifies how to map predicates P ∈ P_X on values X ∈ B to predicates T˙P ∈ P_{TX} on computations TX, which enables us to express, for example, total/partial correctness and may/must nondeterminism [1].

We explain the usage of these categorical constructions by giving semantics to a dependent refinement type system with computational effects, which is based on [4]. Our system also supports subtyping relations induced by logical implication. We prove soundness of the dependent refinement type system.

Finally, we discuss how to handle recursion in dependent refinement type systems. In [4], Ahman gives semantics to recursion in a specific model, i.e., the fibration of continuous families of ω-cpos **CFam**(**CPO**) → **CPO**. We consider a more general characterization of recursion by adapting Conway operators to CCompCs, which enables us to lift the structure for recursion. We show that a rule for partial correctness in our dependent refinement type system is sound under the existence of a generalized Conway operator.

Our contributions are summarized as follows.


### **2 Preliminaries**

We review basic definitions and fix notations for comprehension categories, which are used as categorical models for dependent type theories. We assume basic knowledge of fibrations (see e.g. [10]).

Let p : E → B be a fibration (opfibration). We denote the cartesian (cocartesian) lifting over u : I → J by ū(Y) : u^∗Y → Y (u(X) : X → u_!X), where u^∗ : E_J → E_I (u_! : E_I → E_J) is the reindexing (coreindexing) functor. We call p : E → B a *posetal fibration* if p is a fibration such that each fibre category is a poset. Note that the fibration p : E → B is split and faithful if p is posetal.

A *comprehension category* is a functor P : E → B^→ such that the composite cod ∘ P : E → B is a fibration and P maps cartesian morphisms to pullbacks in B. A comprehension category P is *full* if P is fully faithful.

A *comprehension category with unit* is a fibration p : E → B that has a fibred terminal object 1 : B → E and a comprehension functor {−} : E → B which is a right adjoint of the fibred terminal object functor, 1 ⊣ {−}. The projection π_X : {X} → pX is defined by π_X = p(ε_X) for each X ∈ E, where ε is the counit of 1 ⊣ {−}. Intuitively, E represents a collection of types Γ ⊢ A in dependent type theories; B represents a collection of contexts Γ; p : E → B is the mapping (Γ ⊢ A) ↦ Γ; 1 : B → E is the unit type, Γ ↦ (Γ ⊢ 1); and {−} is the mapping (Γ ⊢ A) ↦ Γ, x : A where x is a fresh variable.

The comprehension category with unit p : E → B induces several structures. It induces a comprehension category P defined by P(X) = π_X. The adjunction 1 ⊣ {−} defines a bijection s : E_I(1I, X) ≅ {f : I → {X} | π_X ∘ f = id_I} between vertical morphisms in E and sections in B. For each X, Y ∈ E_I, we have an isomorphism φ : E_{{X}}(1{X}, π_X^∗ Y) ≅ E_I(X, Y). Consider the pullback square P(π̄_X(Y)) where X, Y ∈ E_I. By the universal property of pullbacks, we have the symmetry isomorphism σ_{X,Y} : {π_X^∗ Y} → {π_Y^∗ X} as the unique morphism σ_{X,Y} such that π_{π_X^∗ Y} = {π̄_Y(X)} ∘ σ_{X,Y} and {π̄_X(Y)} = π_{π_Y^∗ X} ∘ σ_{X,Y}. Similarly, we have the diagonal morphism δ_X : {X} → {π_X^∗ X} as the unique morphism δ_X such that π_{π_X^∗ X} ∘ δ_X = {π̄_X(X)} ∘ δ_X = id_{{X}}.

Let p : E → B be a comprehension category with unit and q : D → B be a fibration. The fibration q has p*-products* if π_X^∗ : D_{pX} → D_{{X}} has a right adjoint π_X^∗ ⊣ ∏_X for each X ∈ E and these adjunctions satisfy the BC (Beck–Chevalley) condition for each pullback square Pf, where P is the comprehension category induced by p and f is a cartesian morphism in E. Similarly, we define p*-coproducts* by ∐_X ⊣ π_X^∗ and p*-equality* by Eq_X ⊣ δ_X^∗, plus the BC condition for each cartesian morphism (see [10, Definition 9.3.5] for detail).

A comprehension category with unit p : E → B admits *products* (*coproducts*) if it has p-products (p-coproducts). The coproducts are *strong* if the canonical morphism κ : {Y} → {∐_X Y}, defined by κ = {π̄_X(∐_X Y) ∘ η_Y} where η is the unit of ∐_X ⊣ π_X^∗, is an isomorphism for each X ∈ E and Y ∈ E_{{X}}. A *closed comprehension category* (CCompC) is a full comprehension category with unit that admits products and strong coproducts and has a terminal object in the base category. A *split closed comprehension category* (SCCompC) is a CCompC such that p is a split fibration, and the BC condition for products and coproducts holds strictly (i.e., canonical isomorphisms are identities). For example, the simple fibration s_B : **s**(B) → B on a cartesian closed category B is an SCCompC (see [10, Theorem 10.5.5]). Another example of SCCompCs is the family fibration fam**Set** : **Fam**(**Set**) → **Set**.

Fibred coproducts in a comprehension category with unit p : E → B are *strong* if the functor ⟨{ι1}^∗, {ι2}^∗⟩ : E_{{X+Y}} → E_{{X}} × E_{{Y}} is fully faithful, where ι1 : X → X + Y and ι2 : Y → X + Y are the injections for fibred coproducts. Strong fibred coproducts are used to interpret fibred coproduct types A + B.

#### **3 Lifting SCCompCs and Fibred Coproducts**

In this section, we give a construction of liftings of SCCompCs with strong fibred coproducts from given SCCompCs with strong fibred coproducts for underlying types and posetal fibrations for predicate logic satisfying appropriate conditions.

#### **3.1 Lifting SCCompCs**

Let p : E → B be an SCCompC for underlying type systems. Let q : P → B be a posetal fibration with fibred finite products for predicate logic.

**Definition 1.** We define a category {E | P} by the pullback of q^→ : P^→ → B^→ along P : E → B^→, where the comprehension category P is induced by p : E → B.

$$\begin{array}{ccc}
\{\mathbb{E}\mid\mathbb{P}\} & \xrightarrow{\;(q^{\to})^{*}\mathcal{P}\;} & \mathbb{P}^{\to} \\
{\scriptstyle \mathcal{P}^{*}(q^{\to})}\downarrow & & \downarrow{\scriptstyle q^{\to}} \\
\mathbb{E} & \xrightarrow{\;\mathcal{P}\;} & \mathbb{B}^{\to}
\end{array}$$

That is, objects are tuples (X, P, Q) where X ∈ E, P ∈ P_{pX}, Q ∈ P_{{X}}, and Q ≤ π_X^∗P; and morphisms are tuples (f, g, h) : (X, P, Q) → (X′, P′, Q′) where f : X → X′, g : P → P′, h : Q → Q′, pf = qg, and {f} = qh.

The intuition of this definition is as follows. For each object (X, P, Q) ∈ {E | P}, X represents a type Γ ⊢ A in the underlying type system, P represents a predicate on the context Γ, and Q represents the conjunction of a predicate on Γ, v : A and the predicate P (thus Q ≤ π_X^∗P is imposed). Note that P^∗(q^→) : {E | P} → E is faithful because q is faithful.

Let {p | q} : {E | P} → P be the functor defined by cod ∘ (q^→)^∗P, that is, (X, P, Q) ↦ P. The functor {p | q} inherits most of the CCompC structure of p : E → B.

**Lemma 2.** *The functor* {p | q} : {E | P} → P *is a split fibration. The cartesian lifting of* g : P → P′ *is given by*

$$(\overline{qg}(X),\; g,\; \overline{\{\overline{qg}(X)\}}(Q) \circ \pi') : ((qg)^{*}X,\; P',\; \pi^{*}_{(qg)^{*}X}P' \wedge \{\overline{qg}(X)\}^{*}Q) \to (X, P, Q)$$

*where* π′ *is the projection for fibred products.*

**Lemma 3.** *The fibration* {p | q} : {E | P} → P *is a full comprehension category with unit that admits strong coproducts.*

*Proof.* The main idea is that the structure in the CCompC p : E → B can be lifted to {E | P} → P. Here, we only show the definition of (object parts of) fibred terminal objects 1 : P → {E | P}, the comprehension functor {−} : {E | P} → P, and coproducts ∐_{(X,P,Q)} : {E | P}_Q → {E | P}_P for each (X, P, Q) ∈ {E | P}.

$$1P = (1_{qP},\, P,\, \pi^{*}_{1_{qP}}P) \qquad \{(X, P, Q)\} = Q \qquad \coprod_{(X,P,Q)}(Y, Q, R) = \Big(\coprod_X Y,\, P,\, (\kappa^{-1})^{*}R\Big)$$

The rest of the proof is omitted.

The existence of products in {p | q} requires additional conditions.

**Lemma 4.** *If* q : P → B *has fibred exponentials and* p*-products (in addition to fibred finite products), then* {p | q} : {E | P} → P *admits products.*

*Proof.* We define ∏_{(X,P,Q)} : {E | P}_Q → {E | P}_P by

$$\prod_{(X,P,Q)}(Y, Q, R) = \Big(\prod_X Y,\; P,\; \pi^{*}_{\prod_X Y}P \;\wedge\; \prod_{\pi^{*}_{\prod_X Y}X} \sigma^{*}_{\prod_X Y,\, X}\big(\pi^{*}_{\pi^{*}_X \prod_X Y}Q \Rightarrow \{\varepsilon^{\pi^{*}_X \dashv \prod_X}_{Y}\}^{*} R\big)\Big).$$

$$Q \in \mathbb{P}_{\{X\}} \xrightarrow{\;\pi^{*}_{\pi^{*}_X \prod_X Y}\;} \mathbb{P}_{\{\pi^{*}_X \prod_X Y\}} \xleftarrow{\;\{\varepsilon_Y\}^{*}\;} \mathbb{P}_{\{Y\}} \ni R, \qquad \mathbb{P}_{\{\pi^{*}_X \prod_X Y\}} \xrightarrow{\;\sigma^{*}_{\prod_X Y,\, X}\;} \mathbb{P}_{\{\pi^{*}_{\prod_X Y} X\}} \xrightarrow{\;\prod_{\pi^{*}_{\prod_X Y} X}\;} \mathbb{P}_{\{\prod_X Y\}}$$

Then, this gives products in {p | q} but we omit the lengthy proof.

As a result, we get a lifting of SCCompCs over p : E → B.

**Fig. 1.** The morphism of SCCompCs from {p | q} : {E | P} → P to p : E → B, via (P^∗(q^→), q).

**Theorem 5.** *If* p : E → B *is an SCCompC and* q : P → B *is a fibred ccc that has* p*-products, then* {p | q} : {E | P} → P *is an SCCompC. Moreover,* (P^∗(q^→), q) : {p | q} → p *is a morphism of SCCompCs, i.e., a split fibred functor that preserves the CCompC structure strictly.*

*Proof.* By Lemma 3 and Lemma 4. A terminal object in P exists because B has a terminal object and q : P → B has fibred terminal objects. It is almost obvious that (P^∗(q^→), q) preserves the structure of CCompCs.

**Example 6.** Consider the simple fibration s**Set** : **s**(**Set**) → **Set** and the subobject fibration sub**Set** : **Sub**(**Set**) → **Set** (see [10, §1.3]). Objects in {**s**(**Set**) | **Sub**(**Set**)} are tuples ((I,X), P, Q) where (I,X) ∈ **s**(**Set**), P ⊆ I, and Q ⊆ P × X ⊆ I × X, and morphisms are those in **s**(**Set**) that preserve predicates. In {s**Set** | sub**Set**} : {**s**(**Set**) | **Sub**(**Set**)} → **Sub**(**Set**), products are given by

$$\prod_{((I,X),P,Q)} ((I \times X, Y), Q, R) = \Big((I, X \Rightarrow Y),\; P,\; \{(i, f) \in I \times (X \Rightarrow Y) \mid i \in P \land \forall x \in X.\ (i, x) \in Q \implies ((i, x), f(x)) \in R\}\Big). \tag{2}$$
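On finite sets, formula (2) can be computed by brute force; the sketch below (ours, with sets and functions represented naively in Python) enumerates all functions X ⇒ Y and keeps the pairs (i, f) satisfying the predicate.

```python
from itertools import product

def product_predicate(I, X, Y, P, Q, R):
    """Brute-force version of (2): pairs (i, f) with i in P such that for every x
    with (i, x) in Q, ((i, x), f(x)) is in R.  Functions X -> Y are dicts."""
    functions = [dict(zip(X, values)) for values in product(Y, repeat=len(X))]
    return {(i, tuple(sorted(f.items())))
            for i in I for f in functions
            if i in P and all(((i, x), f[x]) in R for x in X if (i, x) in Q)}

# Tiny example: R relates (i, x) to y iff y == x, so only the identity function survives.
I = X = Y = [0, 1]
P = {0, 1}
Q = {(i, x) for i in I for x in X}
R = {((i, x), y) for i in I for x in X for y in Y if y == x}
print(product_predicate(I, X, Y, P, Q, R))
```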

**Example 7.** Let erel : **ERel** → **Set** be the fibration of endorelations defined by change-of-base from **Sub**(**Set**) → **Set** along the functor X → X × X. The fibration erel is a fibred ccc and has products (i.e. right adjoints of reindexing functors that satisfy the BC condition for each pullback square). Therefore, erel has p-products for any comprehension category with unit p. If we apply Theorem 5 to erel and the simple fibration s**Set** : **s**(**Set**) → **Set**, then products are defined similarly to Example 6.

**Example 8.** Consider the family fibration fam**Set** : **Fam**(**Set**) → **Set** [10, Def 1.2.1] and the subobject fibration sub**Set** : **Sub**(**Set**) → **Set**. Objects in {**Fam**(**Set**) | **Sub**(**Set**)} are tuples ((I,X), P, Q) where (I,X) ∈ **Fam**(**Set**), P ⊆ I, and Q ⊆ ∐_{i∈P} X_i ⊆ ∐_{i∈I} X_i. Note that subsets Q ⊆ ∐_{i∈I} X_i are in one-to-one correspondence with families of subsets (Q_i ⊆ X_i)_{i∈I} when we define Q_i = ι_i^∗(Q), where ι_i : X_i → ∐_{i∈I} X_i is the i-th injection. So, we often identify Q with the family of subsets Q_i ⊆ X_i. We get products in {fam**Set** | sub**Set**} : {**Fam**(**Set**) | **Sub**(**Set**)} → **Sub**(**Set**) by modifying (2) for dependent functions.

#### **3.2 Lifting Fibred Coproducts**

A sufficient condition for {p | q} : {E | P} → P to have strong fibred coproducts is given by the following lemma, which is analogous to [9, Prop. 4.5.8].

**Lemma 9.** *If (1)* p : E → B *is a CCompC that has strong fibred coproducts; (2) for each* X, Y ∈ E_I*,* X′, Y′ ∈ E_{I′}*,* u : I → I′*, and pair of cartesian liftings* f : X → X′ *and* g : Y → Y′ *over* u*, the following two squares are pullbacks*

$$\begin{array}{ccccc}
\{X\} & \xrightarrow{\{\iota_{1}\}} & \{X+Y\} & \xleftarrow{\{\iota_{2}\}} & \{Y\} \\
{\scriptstyle\{f\}}\downarrow & & \downarrow{\scriptstyle\{f+g\}} & & \downarrow{\scriptstyle\{g\}} \\
\{X'\} & \xrightarrow{\{\iota_{1}\}} & \{X'+Y'\} & \xleftarrow{\{\iota_{2}\}} & \{Y'\}
\end{array}$$

*(3)* q : P → B *is a fibred distributive category, and (4) for each* X, Y ∈ E_I *and* Z ∈ E_{{X+Y}}*,* q *has cocartesian liftings of* {ι₁} : {X} → {X + Y}*,* {ι₂} : {Y} → {X + Y}*,* {{ι₁}(Z)} : {{ι₁}∗Z} → {Z}*, and* {{ι₂}(Z)} : {{ι₂}∗Z} → {Z} *that satisfy the BC condition for each pullback square and Frobenius, then* {p | q} : {E | P} → P *has strong fibred coproducts, and the fibred functor* (P∗(q→), q) : {p | q} → p *strictly preserves fibred coproducts.*

*Proof.* We define fibred coproducts by (X, P, Q) + (Y, P, R) = (X + Y, P, {ι₁}!Q ∨ {ι₂}!R). We omit the rest of the proof.

Note that if q is fibred bicartesian closed, then q is a fibred distributive category.

**Example 10.** Consider s**Set** : **s**(**Set**) → **Set** and sub**Set** : **Sub**(**Set**) → **Set** (recall Example 6). This combination satisfies the four conditions of Lemma 9. Fibred coproducts in {**s**(**Set**) | **Sub**(**Set**)} → **Sub**(**Set**) are defined as follows.

((I,X), P, Q) + ((I,Y ), P, R) = ((I,X + Y ), P, {(i, x) | (i, x) ∈ Q ∨ (i, x) ∈ R})

### **4 Lifting Monads on SCCompCs**

Suppose we have a SCCompC p : E → B and a posetal fibration q : P → B as ingredients for {p | q} : {E | P} → P in Theorem 5. We explain how to construct a fibred monad on {p | q} : {E | P} → P from monads on p and q.

First, we assume that a monad T on B and a fibred monad T̂ on p : E → B are given. These monads are intended to represent the same computational effects in underlying type systems, but T is more "primitive" than T̂, and T̂ is induced from T in some natural way. For example, we can use the maybe monad or the powerset monad on **Set** as T and define T̂ by (I,X) ↦ (I, TX) on the simple fibration **s**(**Set**) → **Set**. In such a situation, we often have an oplax monad morphism (Definition 11) θ : {T̂(−)} → T{−}. Intuitively, θ extends the action of T̂ on types to contexts, just like strengths of strong monads. We also need a lifting Ṫ of T along q : P → B to specify a mapping from predicates on values in X ∈ B to predicates on computations in TX [1]. Given all these ingredients and some additional conditions, we define a fibred monad on {p | q} : {E | P} → P, which is a lifting of the fibred monad T̂ on p : E → B.

**Definition 11 (oplax monad morphism).** Let C, D be categories, F : C → D be a functor, and (S, η^S, μ^S), (T, η^T, μ^T) be monads on C and D, respectively. A natural transformation θ : FS → TF is an *oplax monad morphism* if θ respects units and multiplications.

$$\theta_X \circ F\eta^S_X = \eta^T_{FX} \qquad\qquad \theta_X \circ F\mu^S_X = \mu^T_{FX} \circ T\theta_X \circ \theta_{SX}$$
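For a concrete feel for these two equations, the following Python sketch checks them for the maybe monad on finite sets, taking F to be the functor I × (−) for a fixed finite I and θ the evident strength-like map. The encoding (None for "Nothing", the specific sets I and X) is purely illustrative and not taken from the paper.

```python
# Maybe monad on finite sets; None plays the role of "Nothing".
NOTHING = None
def eta(x):            # unit
    return ('Just', x)
def mu(mm):            # multiplication: Maybe (Maybe X) -> Maybe X
    return mm[1] if mm is not NOTHING else NOTHING
def fmap(f, m):        # functor action on morphisms
    return ('Just', f(m[1])) if m is not NOTHING else NOTHING

def theta(i, m):       # theta_X : I x Maybe X -> Maybe (I x X)
    return ('Just', (i, m[1])) if m is not NOTHING else NOTHING

I, X = [0, 1], ['a', 'b']
MX = [NOTHING] + [eta(x) for x in X]
MMX = [NOTHING] + [eta(m) for m in MX]

# unit equation: theta(i, eta(x)) == eta((i, x))
assert all(theta(i, eta(x)) == eta((i, x)) for i in I for x in X)

# multiplication equation:
# theta(i, mu(mm)) == mu(fmap(theta applied componentwise, theta(i, mm)))
assert all(theta(i, mu(mm)) == mu(fmap(lambda im: theta(*im), theta(i, mm)))
           for i in I for mm in MMX)
print("oplax monad morphism equations hold on this finite example")
```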

**Theorem 12.** *Let* T *be a monad on* B*,* T̂ *be a fibred monad on* p : E → B *in the 2-category* **Fib**_B *of fibrations over* B*,* θ : {T̂(−)} → T{−} *be an oplax monad morphism, and* Ṫ *be a fibred lifting [1] of* T *along* q : P → B*. If*

$$\pi_{\hat{T}X}^{\ast} P \wedge \theta_X^{\ast} \dot{T} Q \le \theta_X^{\ast} \dot{T} (\pi_X^{\ast} P \wedge Q) \tag{3}$$

*holds for each* X ∈ E*,* P ∈ P_{pX} *and* Q ∈ P_{{X}}*, then there exists a fibred monad* S *on* {p | q} : {E | P} → P *in* **Fib**_P *such that the fibred functor* {p | q} → p *in Theorem 5 is a fibred monad morphism from* S *to* T̂*.*

*Proof.* We define S(X, P, Q) = (T̂X, P, π∗_{T̂X} P ∧ θ∗_X Ṫ Q). Then the monad structure of T̂ lifts to S. The assumption (3) is required to prove that S is fibred.

$$\begin{array}{ccc} \mathbb{P} & \ni & \theta_X^{\ast} \dot{T} Q \xrightarrow{\ \overline{\theta_X}(\dot{T}Q)\ } \dot{T} Q \\ \mathbb{B} & \ni & \{\hat{T}X\} \xrightarrow{\qquad \theta_X \qquad} T\{X\} \end{array} \qquad \square$$

**Example 13.** Any strong monad T on a CCC B gives rise to a split fibred monad T̂ on the simple fibration s_B : **s**(B) → B (in fact, there is a one-to-one correspondence [10, Ex. 2.6.10]). The monad T̂ is defined by (I,X) ↦ (I, TX). An oplax monad morphism θ : I × TX → T(I × X) is given by the strength.

Now consider the case where B = **Set**. Since the strength for the monad T on **Set** is given uniquely [17, Proposition 3.4], we can prove that (3) holds for any fibred lifting of T along the subobject fibration sub**Set** : **Sub**(**Set**) → **Set**.

Let T be the maybe monad (−) + {∗}. There are two fibred liftings of T:

$$\dot{T\_1}(P \subseteq I) = (P + \{\ast\} \subseteq I + \{\ast\}) \qquad \dot{T\_2}(P \subseteq I) = (P \subseteq I + \{\ast\})$$

for each (P ⊆ I) ∈ **Sub**(**Set**). The lifting Ṫ₁ corresponds to partial correctness, and Ṫ₂ corresponds to total correctness. The fibred monads on {s**Set** | sub**Set**} defined in Theorem 12 from Ṫ₁ and Ṫ₂ are given by

$$\begin{aligned} &\left( (I,X),P,Q \right) \mapsto \left( (I,X+\{\ast\}),P,\{ (i,x) \mid (i \in P \land x=\ast ) \lor (i,x) \in Q \} \right) \\ &\left( (I,X),P,Q \right) \mapsto \left( (I,X+\{\ast\}),P,\{ (i,x) \mid (i,x) \in Q \} \right) \end{aligned}$$

respectively. Here, we leave the left/right injection of coproducts implicit.
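The two resulting predicate transformers can be spelled out directly on finite sets. The following Python sketch is only an illustration of the two displayed assignments (injections left implicit, as in the text); the function names and example data are assumptions made here.

```python
STAR = '*'   # the extra element added by the maybe monad

def lift_partial(P, Q):
    """Predicate part obtained from T1 (partial correctness): a computation may
    'diverge' (return STAR) whenever the context satisfies P, or terminate with
    a value satisfying Q."""
    return {(i, STAR) for i in P} | set(Q)

def lift_total(P, Q):
    """Predicate part obtained from T2 (total correctness): the computation
    must terminate with a value satisfying Q."""
    return set(Q)

# object ((I, X), P, Q) with I = {0, 1}, X = {'a', 'b'}
P = {0}                 # predicate on contexts I
Q = {(0, 'a')}          # predicate on I x X
print(lift_partial(P, Q))   # {(0, '*'), (0, 'a')}
print(lift_total(P, Q))     # {(0, 'a')}
```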

**Example 14.** For each monad T on **Set**, we have a split fibred monad on the family fibration **Fam**(**Set**) → **Set** defined by T̂(I,X) = (I, T ∘ X). We have an oplax monad morphism θ : ∐_{i∈I} TX_i → T(∐_{i∈I} X_i) defined by the cotupling [(Tι_i)_{i∈I}] : ∐_{i∈I} TX_i → T(∐_{i∈I} X_i), where ι_i : X_i → ∐_{i∈I} X_i is the i-th injection. The condition (3) holds for any fibred lifting of T along the subobject fibration **Sub**(**Set**) → **Set**. Moreover, we have ι∗_i θ∗ Ṫ Q = Ṫ ι∗_i Q for each Q ∈ **Sub**(**Set**)_{∐_{i∈I} X_i}, so the monad in Theorem 12 is given by

$$\left( (I, X), P, (Q_i \subseteq X_i)_{i \in I} \right) \mapsto \left( (I, T \circ X), P, (\dot{T}Q_i \subseteq TX_i)_{i \in I} \right).$$

### **5 Soundness**

We consider a concrete dependent refinement type system with computational effects and define a sound semantics to show that the SCCompC defined in Theorem 5 has sufficient structure for dependent refinement types. Here, we consider two type systems. One is an underlying type system that is a fragment of EMLTT [2–4]. The other is a refinement of the underlying type system that has refinement types {v : A | p} and a subtyping relation Γ ⊢ A <: B induced by logical implication. The two type systems share a common syntax for terms, while types are more expressive in the refinement type system. We consider liftings of fibred adjunction models to interpret the refinement type system. Here, Theorem 12 can be used to obtain a lifting of fibred adjunction models via the Eilenberg-Moore construction. We prove a soundness theorem stating that if a term is well-typed in the refinement type system, then the interpretation of the term has a lifting along the morphism of CCompCs defined in Theorem 5.

### **5.1 Underlying Type System**

We define the underlying dependent type system by a slightly modified version of a fragment of EMLTT [2–4]. We remove some of the types and terms from the original for simplicity. We parameterize our type system with a set of base type constructors (ranged over by b) and a set of value constants (ranged over by c) for convenience.

We define value types (A, B, . . .), computation types (C, D,...), contexts (Γ,. . .), value terms (V, W, . . .), and computation terms (M,N,...) as follows.

$$\begin{aligned} A &:= 1 \mid b_A(V) \mid \Sigma x{:}A.B \mid U\underline{C} \mid A + B\\ \underline{C} &:= FA \mid \Pi x{:}A.\underline{C} \qquad\qquad \Gamma := \diamond \mid \Gamma, x : A\\ V &:= x \mid \ast \mid c_A \mid \langle V, W \rangle_{(x:A).B} \mid \mathsf{thunk}\; M \mid \mathsf{inl}_{A+B}\; V \mid \mathsf{inr}_{A+B}\; V\\ M &:= \mathsf{return}\; V \mid M \;\mathbf{to}\; x : A \;\mathbf{in}_{\underline{C}}\; N \mid \mathsf{force}_{\underline{C}}\; V \mid \lambda x : A.M \mid M(V)_{(x:A).\underline{C}} \mid {}\\ &\qquad \mathsf{pm}\; V \;\mathbf{as}\; \langle x : A, y : B \rangle \;\mathbf{in}_{z.\underline{C}}\; M \mid {}\\ &\qquad \mathsf{case}\; V \;\mathsf{of}_{z.\underline{C}}\; (\mathsf{inl}\; (x : A) \mapsto M, \mathsf{inr}\; (y : B) \mapsto N) \end{aligned}$$

We implicitly assume that variables in Γ are mutually different. We use many type annotations in the syntax of terms for a technical reason, but we may omit them when they are clear from the context. We define substitution A[V/x], C[V/x], W[V/x], and M[V/x] as usual.

For each type constructor b, let arg(b) be a closed value type of the argument of b. We write b : A → Type if A = arg(b). For each value constant c, let ty(c) be a closed value type of c.
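The grammar of types above can be encoded directly as a data structure. The following Python dataclasses are an illustrative encoding made for this presentation (the names `Unit`, `Base`, etc. are not from the paper); value and computation terms are elided.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Unit: pass                        # the unit type 1
@dataclass
class Base:                             # b_A(V); arg stands for the value term V
    b: str; arg: object
@dataclass
class Sigma:                            # Sigma x:A. B
    var: str; dom: 'ValType'; cod: 'ValType'
@dataclass
class U:                                # U C (thunked computation type)
    body: 'CompType'
@dataclass
class Sum:                              # A + B
    left: 'ValType'; right: 'ValType'
@dataclass
class F:                                # F A (returner type)
    ret: 'ValType'
@dataclass
class Pi:                               # Pi x:A. C
    var: str; dom: 'ValType'; cod: 'CompType'

ValType = Union[Unit, Base, Sigma, U, Sum]
CompType = Union[F, Pi]

# example: U (Pi x : b_int(*). F (b_int(*) + 1)), with '*' standing for the unit value
example = U(Pi('x', Base('int', '*'), F(Sum(Base('int', '*'), Unit()))))
print(example)
```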

We have several kinds of judgements: well-formed contexts ⊢ Γ; well-formed (value or computation) types Γ ⊢ A, Γ ⊢ C; well-typed (value or computation) terms Γ ⊢ V : A, Γ ⊢ M : C; and definitional equalities for contexts, types and terms ⊢ Γ₁ = Γ₂, Γ ⊢ A = B, Γ ⊢ C = D, Γ ⊢ V = W : A, Γ ⊢ M = N : C.

Typing rules are basically the same as in EMLTT. Rules for base type constructors and value constants are shown in Fig. 2.

$$\frac{\vdash \Gamma}{\Gamma \vdash c_{\mathrm{ty}(c)} : \mathrm{ty}(c)} \qquad\qquad \frac{b : A \to \mathrm{Type} \qquad \diamond \vdash A \qquad \Gamma \vdash V : A}{\Gamma \vdash b_A(V)}$$

**Fig. 2.** Some typing rules for the underlying type system.

*Semantics.* We use fibred adjunction models to interpret terms and types. We adapt the definition for our fragment of EMLTT as follows.

**Definition 15 (Fibred adjunction models).** A *fibred adjunction model* is a fibred adjunction F ⊣ U : r → p where p : E → B is a SCCompC with strong fibred coproducts and r : C → B is a fibration with p-products.

The Eilenberg-Moore fibration of a CCompC p : E → B inherits products in p [2, Theorem 4.3.24] and thus gives an example of fibred adjunction models.

**Lemma 16.** *Given a SCCompC* p : E → B *with strong fibred coproducts and a split fibred monad* T *on* p*, the Eilenberg-Moore adjunction of* T *is a fibred adjunction model.*

We assume that a fibred adjunction model F ⊣ U : r → p between p : E → B and r : C → B is given, and that interpretations of base type constructors ⟦b⟧ ∈ E and value constants ⟦c⟧ ∈ E₁(1, X) (for some X ∈ E₁) are given. We define a partial interpretation ⟦−⟧ of the following form for raw syntax.

$$\begin{aligned} &\llbracket \Gamma \rrbracket \in \mathbb{B} \qquad \llbracket \Gamma; A \rrbracket \in \mathbb{E}_{\llbracket \Gamma \rrbracket} \qquad \llbracket \Gamma; \underline{C} \rrbracket \in \mathbb{C}_{\llbracket \Gamma \rrbracket} \\ &\llbracket \Gamma; V \rrbracket \in \mathbb{E}_{\llbracket \Gamma \rrbracket}(1_{\llbracket \Gamma \rrbracket}, A) \quad \text{for some } A \in \mathbb{E}_{\llbracket \Gamma \rrbracket} \\ &\llbracket \Gamma; M \rrbracket \in \mathbb{C}_{\llbracket \Gamma \rrbracket}(F 1_{\llbracket \Gamma \rrbracket}, \underline{C}) \quad \text{for some } \underline{C} \in \mathbb{C}_{\llbracket \Gamma \rrbracket} \end{aligned}$$

Most of the definition of ⟦−⟧ is the same as in [2]. For base type constructors b and value constants c, we define ⟦−⟧ as follows.

$$\llbracket \Gamma; b_A(V) \rrbracket = (s\llbracket \Gamma; V \rrbracket)^{\ast} \{\overline{!_{\llbracket \Gamma \rrbracket}}(\llbracket \diamond; A \rrbracket)\}^{\ast} \llbracket b \rrbracket \qquad\qquad \llbracket \Gamma; c_A \rrbracket = {!_{\llbracket \Gamma \rrbracket}^{\ast}}\, \llbracket c \rrbracket$$

Here, left-hand sides are defined if right-hand sides are defined.

**Proposition 17 (Soundness).** *Assume that* ⟦b⟧ ∈ E_{{⟦◇; A⟧}} *holds for each* b : A → Type *such that* ⟦◇; A⟧ *is defined, and that* ⟦c⟧ ∈ E₁(1, ⟦◇; ty(c)⟧) *holds if* ⟦◇; ty(c)⟧ ∈ E₁ *is defined. Then the interpretations* ⟦−⟧ *of well-formed contexts and types and well-typed terms are defined. If two contexts, types, or terms are definitionally equal, then their interpretations are equal.*

#### **5.2 Predicate Logic**

We define syntax for logical formulas by

$$p := \top \mid p \land q \mid p \Rightarrow q \mid \forall x : A.p \mid V =_A W \mid a(V)$$

$$\frac{\Gamma \vdash V : A \qquad \Gamma \vdash W : A}{\Gamma \vdash V =\_A W : \text{Prop}} \qquad \quad \frac{a : A \to \text{Prop} \qquad \diamond \vdash A \qquad \Gamma \vdash V : A}{\Gamma \vdash a(V) : \text{Prop}}$$

**Fig. 3.** Some rules for well-formed predicates.

where a ranges over predicate symbols. Here, we added ⊤ and V =_A W for the typing rule for the unique value of the unit type and for variables of base types (i.e. for selfification [18]), respectively, which we describe later. However, there is a large amount of freedom in the choice of the syntax of logical formulas. The least requirement here is that logical formulas can be interpreted in a posetal fibration q : P → B, and that interpretations of logical formulas admit semantic weakening, substitution, and conversion in the sense of [2, Proposition 5.2.4, 5.2.6]. So, we can almost freely add or remove logical connectives and quantifiers as long as q : P → B admits them.

We define a standard judgement of well-formedness for logical formulas. Some of the rules for well-formedness are shown in Fig. 3.

Logical formulas are interpreted in the fibration q : P → B. We assume that an interpretation ⟦a⟧ ∈ P_{{⟦◇; A⟧}} is given for each predicate symbol a : A → Prop. The interpretation ⟦Γ ⊢ p⟧ ∈ P_{⟦Γ⟧} is standard and defined inductively for each well-formed formula. For example:

$$\begin{aligned} \llbracket \Gamma \vdash V =_A W \rrbracket &= (s\llbracket \Gamma; V \rrbracket)^{\ast} (s(\pi^{\ast}_{\llbracket \Gamma; A \rrbracket}\llbracket \Gamma; W \rrbracket))^{\ast}\, \mathrm{Eq}(\llbracket \Gamma; A \rrbracket) \\ \llbracket \Gamma \vdash a(V) \rrbracket &= (s\llbracket \Gamma; V \rrbracket)^{\ast} \{\overline{!_{\llbracket \Gamma \rrbracket}}(\llbracket \diamond; A \rrbracket)\}^{\ast} \llbracket a \rrbracket \end{aligned}$$

where a : A → Prop is a predicate symbol and s is the bijection defined in §2.

#### **5.3 Refinement Type System**

We refine the underlying type system by adding predicates to base types and the unit type. From now on, we use subscript A<sup>u</sup> for types in the underlying type system to distinguish them from types in the refinement type system.

$$\begin{aligned} A &:= \{ v : b_{A_u}(V) \mid p \} \mid \{ v : 1 \mid p \} \mid \Sigma x{:}A.B \mid U\underline{C} \mid A + B \\ \underline{C} &:= FA \mid \Pi x{:}A.\underline{C} \qquad\qquad \Gamma := \diamond \mid \Gamma, x : A \end{aligned}$$

We use the same definition of terms as the underlying type system and the same set of base type constructors and value constants. Argument types of base type constructors b : A<sup>u</sup> → Type are also the same, but types ty(c) assigned to value constants c are redefined as refinement types. Given a type A (or C) in the refinement type system, we define its underlying type |A| (or |C|) by induction where predicates are eliminated in the base cases.

$$|\{v: b\_{A\_u}(V) \mid p\}| = b\_{A\_u}(V) \qquad |\{v: 1 \mid p\}| = 1$$

Underlying contexts |Γ| are also defined by |◇| = ◇ and |Γ, x : A| = |Γ|, x : |A|.

$$\frac{b : A_u \to \mathrm{Type} \qquad \vdash \Gamma \qquad |\Gamma| \vdash b_{A_u}(V) \qquad |\Gamma|, v : b_{A_u}(V) \vdash p : \mathrm{Prop}}{\Gamma \vdash \{v : b_{A_u}(V) \mid p\}} \qquad \frac{\Gamma; v : b_{A_u}(V) \mid p \sqsubseteq q}{\Gamma \vdash \{v : b_{A_u}(V) \mid p\} <: \{v : b_{A_u}(V) \mid q\}}$$
$$\frac{\vdash \Gamma_1, x : \{v : b_{A_u}(V) \mid p\}, \Gamma_2}{\Gamma_1, x : \{v : b_{A_u}(V) \mid p\}, \Gamma_2 \vdash x : \{v : b_{A_u}(V) \mid v =_{b_{A_u}(V)} x\}} \qquad \frac{\vdash \Gamma \qquad \diamond \vdash \mathrm{ty}(c)}{\Gamma \vdash c_{|\mathrm{ty}(c)|} : \mathrm{ty}(c)}$$
$$\frac{\Gamma \vdash A_2 <: A_1 \qquad \Gamma, x : A_2 \vdash \underline{C}_1 <: \underline{C}_2}{\Gamma \vdash \Pi x{:}A_1.\underline{C}_1 <: \Pi x{:}A_2.\underline{C}_2}$$

**Fig. 4.** Some typing rules for the refinement type system.

Judgements in the refinement type system are as follows. We have judgements for well-formedness or well-typedness for contexts, types and terms in the refinement type system, which are denoted in the same way as in the underlying type system. We do not consider definitional equalities for terms because they are the same as in the underlying type system. Instead, we add judgements for subtyping between types and contexts. They are denoted by ⊢ Γ₁ <: Γ₂ for contexts, Γ ⊢ A <: B for value types, and Γ ⊢ C <: D for computation types.

Most of the term and type formation rules are similar to those of the underlying type system. We list some of the non-trivial modifications of typing rules in Fig. 4. We add typing rules for {v : b_{B_u}(V) | p} and {v : 1 | p}. Subtyping for these types is defined via judgements Γ; v : A_u | p ⊑ q for logical implication. Here, Γ; v : A_u | p ⊑ q means "the assumptions in Γ together with p imply q", where p and q are well-formed formulas in the context |Γ|, v : A_u. We do not specify derivation rules for the judgement Γ; v : A_u | p ⊑ q but assume its soundness (explained later). We allow "selfification" [18] for variables of base types. Subtyping for Σx:A.B, UC, FA, and Πx:A.C is defined covariantly, except for the argument type A of Πx:A.C, which is contravariant. We have the rule of subsumption. Value constants are typed with a refined type assignment ty(c). The unique value ∗ of the unit type has type {v : 1 | ⊤}.

**Lemma 18.** *If we eliminate predicates in the refinement types from well-formed contexts, types and terms, then we get well-formed contexts, types and terms of the underlying type system.*

*If* ⊢ Γ*, then* ⊢ |Γ|*. If* Γ ⊢ A*, then* |Γ| ⊢ |A|*. If* Γ ⊢ C*, then* |Γ| ⊢ |C|*. If* Γ ⊢ V : A*, then* |Γ| ⊢ V : |A|*. If* Γ ⊢ M : C*, then* |Γ| ⊢ M : |C|*. If* ⊢ Γ₁ <: Γ₂*, then* ⊢ |Γ₁| = |Γ₂|*. If* Γ ⊢ A <: B*, then* |Γ| ⊢ |A| = |B|*. If* Γ ⊢ C <: D*, then* |Γ| ⊢ |C| = |D|*.*

*Proof.* By induction on the derivation of judgements. Each typing rule in the refinement type system has a corresponding rule in the underlying system.

**Example 19.** We can express conditional branching using the elimination rule of the fibred coproduct type 1 + 1. For example, assume we have a base type constructor int : 1 → Type for integers and a value constant for comparison.

$$(\leq) : U(\Pi x{:}\mathrm{int}.\ \Pi y{:}\mathrm{int}.\ F(\{v : 1 \mid x \leq y\} + \{v : 1 \mid x > y\}))$$

We can define **if** x ≤ y **then** M **else** N to be syntactic sugar for

$$(x \le' y) \text{ to } z \text{ in } (\text{case } z \text{ of } (\text{inl } v \mapsto M, \text{inr } v \mapsto N))$$

where (≤′) = **force** (≤). Note that M and N are typed in contexts that have v : {v : 1 | x ≤ y} or v : {v : 1 | x > y}, depending on the result of the comparison.

#### **5.4 Semantics**

**Definition 20 (lifting of fibred adjunction models).** Suppose that we have two fibred adjunction models F ⊣ U : q → p between p : E → B and q : C → B, and Ḟ ⊣ U̇ : s → r between r : U → P and s : D → P. The fibred adjunction model Ḟ ⊣ U̇ is a *lifting* of F ⊣ U if there exist functors u : U → E, v : D → C, and t : P → B such that these functors strictly map all the structures of Ḟ ⊣ U̇ to those of F ⊣ U. That is, (u, t) : r → p and (v, t) : s → q are split fibred functors, the pair of fibred functors (u, t) and (v, t) is a map of adjunctions in the 2-category **Fib**, (u, t) strictly preserves the CCompC structure and fibred coproducts, and (v, t) maps r-products to p-products in the strict sense.

We assume that a lifting of fibred adjunction models is given as follows.

$$\begin{array}{ccc} \mathbb{E} \underset{U}{\overset{F}{\rightleftarrows}} \mathbb{C} \ \text{ over } \mathbb{B} & \qquad \{\mathbb{E}\mid\mathbb{P}\} \underset{\dot U}{\overset{\dot F}{\rightleftarrows}} \mathbb{D} \ \text{ over } \mathbb{P} & \qquad \{\mathbb{E}\mid\mathbb{P}\} \xrightarrow{\ u\ } \mathbb{E}, \quad \mathbb{D} \xrightarrow{\ v\ } \mathbb{C} \ \text{ over } q : \mathbb{P} \to \mathbb{B} \end{array} \tag{4}$$

Here, we assume more than just a lifting of fibred adjunction models: we require the specific SCCompC {p | q} with strong fibred coproducts and the split functor (u, q) : {p | q} → p defined in Theorem 5 and Lemma 9. The underlying fibred adjunction model F ⊣ U is used for the underlying type system in §5.1, and q : P → B is used for the predicate logic in §5.2. One way to obtain such liftings of fibred adjunction models is to apply the Eilenberg-Moore construction to the monad morphism in Theorem 12, but in general we do not restrict C and D to be Eilenberg-Moore categories. We further assume that q has p-equalities to interpret logical formulas of the form V =_A W.

We define the partial interpretation of refinement types ⟦Γ⟧ ∈ P, ⟦Γ; A⟧ ∈ {E | P}_{⟦Γ⟧}, and ⟦Γ; C⟧ ∈ D_{⟦Γ⟧} similarly to the underlying type system, but with the following modification. Here, we make use of the definition of {E | P}.

$$\begin{aligned} \llbracket \Gamma; \{v : b(V) \mid p\} \rrbracket &= \left( \llbracket |\Gamma|; b(V) \rrbracket,\ \llbracket \Gamma \rrbracket,\ \pi^{\ast}_{\llbracket |\Gamma|; b(V) \rrbracket} \llbracket \Gamma \rrbracket \land \llbracket |\Gamma|, v : b(V) \vdash p \rrbracket \right) \\ \llbracket \Gamma; \{v : 1 \mid p\} \rrbracket &= \left( \llbracket |\Gamma|; 1 \rrbracket,\ \llbracket \Gamma \rrbracket,\ \pi^{\ast}_{\llbracket |\Gamma|; 1 \rrbracket} \llbracket \Gamma \rrbracket \land \llbracket |\Gamma|, v : 1 \vdash p \rrbracket \right) \end{aligned}$$

For each (X, P, Q), (X′, P′, Q′) ∈ {E | P}, we define a semantic subtyping relation (X, P, Q) <: (X′, P′, Q′) by the conjunction of X = X′, P = P′, and Q ≤ Q′. In other words, we have (X, P, Q) <: (X′, P′, Q′) if and only if there exists a morphism (id_X, id_P, h) : (X, P, Q) → (X′, P′, Q′) that is mapped to identities by u : {E | P} → E and {p | q} : {E | P} → P.

**Lemma 21.**
– *If* ⟦Γ⟧ *is defined, then* ⟦|Γ|⟧ *is defined and equal to* q⟦Γ⟧*.*
– *If* ⟦Γ; A⟧ *is defined, then* ⟦|Γ|; |A|⟧ *is defined and equal to* u⟦Γ; A⟧*.*
– *If* ⟦Γ; C⟧ *is defined, then* ⟦|Γ|; |C|⟧ *is defined and equal to* v⟦Γ; C⟧*.*

*Proof.* By simultaneous induction. The case of {v : A<sup>u</sup> | p} is obvious, and other cases follow from the definition of liftings of fibred adjunction models.

We do not specify syntactic derivation rules for the judgement for logical implication Γ; v : A_u | p ⊑ q. Instead, we assume soundness of Γ; v : A_u | p ⊑ q in the following sense: π∗_{⟦|Γ|; A_u⟧} ⟦Γ⟧ ∧ ⟦|Γ|, v : A_u ⊢ p⟧ ≤ ⟦|Γ|, v : A_u ⊢ q⟧ holds in P_{⟦|Γ|, v : A_u⟧}. For example, we can define a derivation rule for logical implication Γ; v : A_u | p ⊑ q from derivation rules for a predicate logic judgement Γ_u | p ⊑ q ("p implies q in the context Γ_u"). This is done by collecting the predicates in the context Γ by

$$\{\diamond\} = \top \qquad \qquad \{\varGamma, x:A\} = \begin{cases} \{\varGamma\} \land p[x/v] & \text{if } A = \{v: A\_u \mid p\} \\ \{\varGamma\} & \text{otherwise} \end{cases}$$

and defining a derivation rule for the judgement for logical implication Γ; v : A_u | p ⊑ q by |Γ|, v : A_u | {Γ} ∧ p ⊑ q. If the derivation rules for the predicate logic judgement Γ_u | p ⊑ q are sound (i.e., Γ_u | p ⊑ q implies ⟦Γ_u ⊢ p⟧ ≤ ⟦Γ_u ⊢ q⟧), then so is the derived rule for Γ; v : A_u | p ⊑ q. This technique is used in, e.g., [27].
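The predicate-collection function {Γ} above is straightforward to implement. The following Python sketch is purely illustrative (formulas as strings, naive textual substitution of the refinement variable); none of the names come from the paper.

```python
def collect(ctx):
    """Collect the refinement predicates of a context, as in the definition above.
    ctx is a list of (variable, underlying_type, predicate_or_None) triples."""
    conjuncts = []
    for (x, _au, p) in ctx:
        if p is not None:
            # substitute the bound refinement variable v by the context variable x
            # (naive string replacement; fine for this toy example only)
            conjuncts.append(p.replace('v', x))
    return ' and '.join(conjuncts) if conjuncts else 'true'

# Gamma = x : {v : int | v >= 0}, y : int, z : {v : int | v > x}
ctx = [('x', 'int', 'v >= 0'), ('y', 'int', None), ('z', 'int', 'v > x')]
print(collect(ctx))   # x >= 0 and z > x
```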

**Theorem 22 (Soundness).** *Assume that* Γ; v : A_u | p ⊑ q *is sound in the sense described above, that* ⟦b⟧ ∈ E_{{⟦◇; A⟧}} *holds for each* b : A → Type *if* ⟦◇; A⟧ *is defined, and that* ⟦c⟧ ∈ {E | P}₁(1, ⟦◇; ty(c)⟧) *holds if* ⟦◇; ty(c)⟧ ∈ {E | P}₁ *is defined. Then we have the following.*


Since we have the bijection s : {E | P}_P(1_P, (X, P, Q)) → {f : P → Q | π_{(X,P,Q)} ∘ f = id_P} for each (X, P, Q) ∈ {E | P}, we obtain liftings of interpretations of terms along q : P → B.

**Corollary 23.** *If* Γ ⊢ V : A*, then* s⟦|Γ|; V⟧ : ⟦|Γ|⟧ → {⟦|Γ|; A⟧} *has a lifting* s⟦Γ; V⟧ : ⟦Γ⟧ → {⟦Γ; A⟧} *along* q : P → B *(and similarly for computation terms* Γ ⊢ M : C*).*

**Corollary 24.** *Assume that the lifting of fibred adjunction models is given by applying the Eilenberg-Moore construction to a lifting of monads as in Theorem 12. If* Γ ⊢ M : FA*, then* θ ∘ s⟦|Γ|; M⟧ : ⟦|Γ|⟧ → T{⟦|Γ|; A⟧} *has a lifting of type* ⟦Γ⟧ → Ṫ{⟦Γ; A⟧} *along* q : P → B*.*

### **6 Toward Recursion in Refinement Type Systems**

We consider how to deal with general recursion in dependent refinement type systems. In [4], Ahman used a specific model of the fibration **CFam**(**CPO**) → **CPO** of continuous families of ω-cpos to extend EMLTT with recursion. However, we need to identify the structure that characterizes recursion to lift recursion from the underlying type system to dependent refinement type systems. So, we consider a generalization of Conway operators [22] and prove the soundness of the underlying and the dependent refinement type system extended with typing rules for recursion. This extension enables us to reason about partial correctness of general recursion.

Unfortunately, we still do not know an example of liftings of Conway operators, although (1) **CFam**(**CPO**) → **CPO** does have a Conway operator and (2) the soundness of the refinement type system with recursion holds under the existence of a lifting of Conway operators. We leave this problem for future work.

#### **6.1 Conway Operators**

The notion of Conway operators for cartesian categories is defined in [22]. We adapt the definition for comprehension categories with unit. We allow partially defined Conway operators because we need those defined only on interpretations of computation types.

**Definition 25 (Conway operator for comprehension categories with unit).** Let p : E → B be a comprehension category with unit and K ⊆ E be a collection of objects. A *Conway operator* for the comprehension category with unit p defined on K is a family of mappings (−)‡ : E_I(X, X) → E_I(1_I, X) for each X ∈ E_I ∩ K such that the following conditions are satisfied.

**(Naturality)** For each X ∈ K, f ∈ E_I(X, X), and u : J → I, we have u∗(f‡) = (u∗f)‡.
**(Dinaturality)** For each X, Y ∈ K, f ∈ E_I(X, Y), and g ∈ E_I(Y, X), we have (g ∘ f)‡ = g ∘ (f ∘ g)‡.

**(Diagonal property)** For each X ∈ K and f ∈ E_{{X}}(π∗_X X, π∗_X X), if π∗_X X ∈ K, then (φ(f‡))‡ = (φ(δ∗_X(φ⁻¹(f))))‡ holds, where φ : E_{{X}}(1_{{X}}, π∗_X X) → E_I(X, X) is the isomorphism defined in §2.

**Lemma 26.** *Let* B *be a cartesian category. There is a bijective correspondence between the following: (1) Conway operators* (−)† *on the cartesian category* B*; (2) Conway operators* (−)‡ *on the simple comprehension category* **s**(B) → B *that are defined totally on* **s**(B)*.*

**Example 27.** Let K ⊆ **CFam**(**CPO**) be the collection of objects defined by K = {(I,X) ∈ **CFam**(**CPO**) | for each i ∈ I, X_i has a least element}. For each (I,X) ∈ K and vertical morphism f = (id_I, (f_i)_{i∈I}) : (I,X) → (I,X), we define f‡ = (id_I, (∗ ↦ lfp f_i)_{i∈I}) : (I, 1) → (I,X). Then (−)‡ is a Conway operator, which is implicitly used in [4].
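The fibrewise least-fixed-point construction in Example 27 can be illustrated with Kleene iteration. The sketch below replaces ω-cpos by finite chains (so the iteration terminates); the toy family, the bound on the chain, and all names are assumptions made only for this illustration.

```python
def lfp(f, bottom, max_iter=1000):
    """Kleene iteration: least fixed point of a monotone endomap on a poset,
    assuming the iteration stabilises (as it does on finite chains)."""
    x = bottom
    for _ in range(max_iter):
        fx = f(x)
        if fx == x:
            return x
        x = fx
    raise RuntimeError("did not stabilise")

# A toy family over I = {0, 1}: each fibre is the finite chain 0 < 1 < ... < 5
# (so it has a least element), and f_i is a monotone endomap of that chain.
I = [0, 1]
f = {0: (lambda n: min(n + 1, 3)),       # least fixed point is 3
     1: (lambda n: 2 if n < 2 else n)}   # least fixed point is 2

dagger = {i: lfp(f[i], bottom=0) for i in I}   # the map * |-> lfp f_i, fibrewise
print(dagger)   # {0: 3, 1: 2}
```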

$$\frac{\Gamma \vdash \underline{C} \qquad \Gamma, x : U\underline{C} \vdash M : \underline{C}}{\Gamma \vdash \mu x : U\underline{C}.M : \underline{C}} \qquad \frac{\Gamma \vdash \underline{C} = \underline{D} \qquad \Gamma, x : U\underline{C} \vdash M = N : \underline{C}}{\Gamma \vdash \mu x : U\underline{C}.M = \mu x : U\underline{D}.N : \underline{C}}$$
$$\frac{\Gamma \vdash \underline{C} \qquad \Gamma, x : U\underline{C} \vdash M : \underline{C}}{\Gamma \vdash \mu x : U\underline{C}.M = M[\mathsf{thunk}\,(\mu x : U\underline{C}.M)/x] : \underline{C}} \qquad \frac{\Gamma \vdash \underline{C} \qquad \Gamma, x : U\underline{C}, y : U\underline{C} \vdash M : \underline{C}}{\Gamma \vdash \mu x : U\underline{C}.\mu y : U\underline{C}.M = \mu x : U\underline{C}.M[x/y] : \underline{C}}$$

**Fig. 5.** Typing rules for general recursion.

#### **6.2 Recursion in the Underlying Type System**

*Syntax.* We add recursion μx : UC.M to the syntax of computation terms. We also add typing rules in Fig. 5.

*Semantics.* Assume we have a fibred adjunction model F ⊣ U : r → p where p : E → B and r : C → B. We need a Conway operator defined on the objects in {⟦Γ; UC⟧ | Γ ⊢ C} ⊆ E. However, this would be circular because ⟦Γ; UC⟧ may contain terms of the form μx : UD.M, whose interpretations are defined by the Conway operator. So, we use a slightly stronger condition.

**Definition 28.** A *Conway operator defined on computation types* is a Conway operator defined on K ⊆ E such that K satisfies the following conditions. (1) UFX ∈ K holds for each X ∈ E. (2) Π_X Y ∈ K holds for each X ∈ E and Y ∈ K ∩ E_{{X}}. (3) For each X ∈ K and Y ∈ E, X ≅ Y implies Y ∈ K.

Given a Conway operator defined on computation types, we interpret μx : UC.M by ⟦Γ; μx : UC.M⟧ = (φ(⟦Γ, x : UC; M⟧))‡ : 1_{⟦Γ⟧} → U⟦Γ; C⟧.

**Proposition 29.** *Soundness (Proposition 17) holds for the underlying type system extended with general recursion.*

*Proof.* By induction. We can prove that the given Conway operator is defined on {⟦Γ; UC⟧ | Γ ⊢ C} ⊆ E by [2, Proposition 4.1.14].

#### **6.3 Recursion in Refinement Type System**

*Syntax.* We add the typing rule for Γ ⊢ μx:UC.M : C in Fig. 5 to the refinement type system. Here, recall that we removed definitional equalities when we consider the refinement type system.

*Semantics.* We consider liftings of Conway operators to interpret recursion in the refinement type system.

**Definition 30.** Let p : E → B and q : D → A be comprehension categories with unit, and let (u, v) : p → q be a morphism of comprehension categories with unit. Assume q has a Conway operator (−)‡ defined on K ⊆ D. A *lifting* of the Conway operator (−)‡ along (u, v) is a Conway operator (−)♯ for p defined on L ⊆ E such that uL ⊆ K and u(f♯) = (uf)‡ for each f ∈ E_I(X, X) where X ∈ L.

**Lemma 31.** *Let* (u, v) *be the morphism of CCompCs defined in Theorem 5. Assume* p : E → B *has a Conway operator* (−)‡ *defined on* K ⊆ E*. The CCompC* {E | P} → P *has a lifting of the Conway operator defined on* L ⊆ {E | P} *if* uL ⊆ K *and, for each* (X, P, Q) ∈ L *and* f ∈ {E | P}_P((X, P, Q), (X, P, Q))*,* {f‡} *has a lifting* π∗_{1_{pX}}P → Q *along* q : P → B*.*

*Proof.* Let (f, id_P, h) : (X, P, Q) → (X, P, Q) be a morphism in {E | P} where (X, P, Q) ∈ L. We define a Conway operator by (f, id_P, h)♯ = (f‡, id_P, h′) : (1_{pX}, P, π∗_{1_{pX}}P) → (X, P, Q), where h′ is a lifting of {f‡}.

We assume that a lifting of fibred adjunction models (4) together with a lifting of Conway operators defined on computation types is given.

**Theorem 32.** *Soundness (Theorem 22) holds for the refinement type system extended with general recursion.*

Consider the fibration **CFam**(**CPO**) → **CPO** for the underlying type system with recursion. To support recursion in our refinement type system, a natural choice of a fibration for predicate logic is the fibration of admissible subsets **Adm**(**CPO**) → **CPO**, because the least fixed point of an ω-continuous function f : X → X is given by lfp f = ⋁_{n∈ω} f^n(⊥). However, we cannot apply Theorem 5 because **Adm**(**CPO**) → **CPO** is not a fibred ccc [9, §4.3.2]. Specifically, it is not clear whether this combination admits products. We believe that our approach is quite natural but leave giving concrete examples of liftings of Conway operators for future work.

### **7 Related Work**

*Dependent refinement types.* Historically, there are two kinds of refinement types. One is *datasort refinement types* [7], which are subsets of underlying types but not necessarily dependent. The other is *index refinement types* [28]. A typical example of index refinement types is a type of lists indexed by natural numbers that represent the length of lists. Nowadays, the word "refinement types" includes datasort and index refinement types, and moreover, mixtures of them.

Among the wide variety of meanings of refinement types, we focus on types equipped with predicates that may depend on other terms [6, 20], which we call *dependent refinement types* or just *refinement types*. Dependent refinement types are widely studied [5, 13, 14, 25] and implemented in, e.g., F* [23, 24] and LiquidHaskell [19, 26, 27]. However, most studies focus on decidable type systems, and only a few consider categorical semantics.

We expect that some of the existing refinement type systems can be combined with effect systems. For example, a dependent refinement type system for nondeterminism and partial/total correctness proposed in [25] contains types for computations indexed by pairs of quantifiers Q₁Q₂ where Q₁, Q₂ ∈ {∀, ∃}. Here, Q₁ represents may/must nondeterminism, and Q₂ represents total/partial correctness. It has been shown that Q₁Q₂ corresponds to four cartesian liftings of the monad P₊((−) + 1) [1, 12]. We conjecture that these liftings are connected by monad morphisms and hence yield a lattice-graded monad. Another example is a relational refinement type system for differential privacy [5]. Their system seems to use a graded lifting of the distribution monad where the lifting is graded by privacy parameters, as pointed out in [21]. We leave combining our refinement type system with effect systems based on graded monads [8, 11, 15] for future work.

*Categorical semantics.* Our interpretation of refinement type systems is based on a morphism of CCompCs, which is a similar strategy to [16]. The difference is that our paper focuses on dependent refinement types and makes the role of predicate logic explicit by giving a semantic construction of refinement type systems from given underlying type systems and predicate logic.

Combining dependent types and computational effects is discussed in [2–4]. Although their aim is not refinement types, their system is a basis for the design and semantics of our refinement type system with computational effects.

Semantics for types of the form {v : A_u | p} is characterized categorically as right adjoints of terminal object functors in [10, Chapter 11]. Such types are called *subset types* there. That work considers the situation where a given CCompC p : E → B is already rich enough to interpret {v : A_u | p}, and does not aim to interpret refinement type systems by liftings of CCompCs. Moreover, we cannot directly use the interpretations in [10] for our CCompC {E | P} → P because we are not given a fibration for predicate logic whose base category is P.

#### **8 Conclusion and Future Work**

We provided a general construction of liftings of CCompCs from combinations of CCompCs and posetal fibrations satisfying certain conditions. This can be seen as a semantic counterpart of constructing dependent refinement type systems from underlying type systems and predicate logic. We identified sufficient conditions for several structures in underlying type systems (e.g. products, coproducts, fibred coproducts, fibred monads, and Conway operators) to lift to dependent refinement type systems. We proved the soundness of a dependent refinement type system with computational effects with respect to interpretations in CCompCs obtained from the general construction.

We aim to extend our dependent refinement type system by combining it with effect systems based on graded monads [8, 11, 15]. We hope that this extension will give us a more expressive framework that subsumes, for example, the dependent refinement type systems in [5, 25]. Another direction is to define interpretations of {v : A_u | p} in the style of subset types in [10, Chapter 11]. Lastly, we are interested in finding more examples of possible combinations of underlying type systems and predicate logic (especially, but not only, for recursion in dependent refinement type systems), so that we can find new practical applications of this work.

*Acknowledgement.* We thank Shin-ya Katsumata, Hiroshi Unno and the anonymous referees for helpful comments. This work was supported by JST ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603).

### **References**



### **Simple Stochastic Games with Almost-Sure Energy-Parity Objectives are in NP and coNP**

Richard Mayr¹, Sven Schewe², Patrick Totzke², and Dominik Wojtczak²(✉)

¹ University of Edinburgh, Edinburgh, UK
² University of Liverpool, Liverpool, UK
{sven.schewe,totzke,d.wojtczak}@liverpool.ac.uk

**Abstract.** We study stochastic games with energy-parity objectives, which combine quantitative rewards with a qualitative ω-regular condition: The maximizer aims to avoid running out of energy while simultaneously satisfying a parity condition. We show that the corresponding almost-sure problem, i.e., checking whether there exists a maximizer strategy that achieves the energy-parity objective with probability 1 when starting at a given energy level k, is decidable and in NP ∩ coNP. The same holds for checking if such a k exists and if a given k is minimal.

**Keywords:** Simple Stochastic Games, Parity Games, Energy Games

#### **1 Introduction**

*Simple stochastic games* (SSGs), also called *competitive Markov decision processes* [30] or *2½-player games* [23,22], are turn-based games of perfect information played on finite graphs. Each state is either random or belongs to one of the players (maximizer or minimizer). A game is played by successively moving a pebble along the game graph, where the next state is chosen by the player who owns the current one or, in the case of random states, according to a predefined distribution. This way, an infinite run is produced. The maximizer tries to achieve an objective (in our case almost surely), while the minimizer tries to prevent this. The maximizer can be seen as a controller trying to ensure an objective in the face of both known random failure modes (encoded by the random states) and an unknown or hostile environment (encoded by the minimizer player).

Stochastic games were first introduced in Shapley's seminal work [46] in 1953 and have since then played a central role in the solution of many problems in computer science, including synthesis of reactive systems [45,42]; checking interface compatibility [27]; well-formedness of specifications [28]; verification of open systems [4]; and many others.

A huge variety of objectives for such games has already been studied in the literature. We will mainly focus on three of them in this paper: parity, mean-payoff, and energy objectives. In order to define them, we assume that numeric rewards are assigned to transitions, and priorities (encoded by bounded non-negative numbers) are assigned to states.

The *parity objective* simply asks that the minimal priority that appears infinitely often in a run is even. Such a condition is a canonical way to define desired behaviors of systems, such as safety, liveness, fairness, etc.; it subsumes all ω-regular objectives. The algorithmic problem of deciding the winner in non-stochastic parity games is polynomial-time equivalent to the model checking of the modal μ-calculus [49] and is at the center of the algorithmic solutions to Church's synthesis problem [44]. But the impact of parity games goes well beyond automata theory and logic: They facilitated the solution of two long-standing open problems in stochastic planning [29] and in linear programming [32], which was done by careful adaptation of the parity game examples on which the strategy improvement algorithm [31] requires exponentially many iterations.

The parity objective can be seen as a special case of the *mean-payoff objective* that asks for the limit average reward per transition along the run to be non-negative. Mean-payoff objectives are among the first objectives studied for stochastic games and go back to a 1957 paper by Gillette [33]. They allow for reasoning about the efficiency of a system, e.g., how fast it operates once optimally controlled.

The *energy objective* [14] can be seen as a refinement of the mean-payoff objective. It asks for the accumulated reward at any point of a run not to be lower than some finite threshold. As the name suggests, it is useful when reasoning about systems with a finite initial energy level that should never become depleted. Note that the accumulated reward is not bounded a priori, which essentially turns a finite-state game into an infinite-state one.

In this paper we consider SSGs with *energy-parity* objectives, which require runs to satisfy both an energy and a parity objective. It is natural to consider such an objective for systems that should not only be correct, but also energy efficient. For instance, consider a robot maintaining a nuclear power plant. We not only require the robot to correctly react to all possible chains of events (parity objective for functional correctness), but also never to run out of energy, as charging it manually would be risky (energy objective).
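For intuition only, the following Python sketch checks the combined energy-parity condition on ultimately periodic ("lasso") runs, where both the objective threshold (0) and the run encoding are assumptions made for this illustration; general runs of a stochastic game are of course not of this shape, and this is not an algorithm used in the paper.

```python
def wins_energy_parity(prefix, cycle, initial_energy):
    """Check energy-parity on the run prefix . cycle^omega, where both parts
    are lists of (priority, reward) pairs."""
    energy = initial_energy
    # energy must stay non-negative along the prefix and one traversal of the cycle
    for (_, reward) in prefix + cycle:
        energy += reward
        if energy < 0:
            return False
    # if the cycle has negative total reward, the energy eventually drops below 0
    if sum(r for (_, r) in cycle) < 0:
        return False
    # parity: the minimal priority seen infinitely often (i.e. on the cycle) is even
    return min(p for (p, _) in cycle) % 2 == 0

print(wins_energy_parity([(1, -1)], [(2, +1), (2, -1)], initial_energy=1))  # True
print(wins_energy_parity([(2, 0)], [(1, +1)], initial_energy=0))            # False
```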

While the complexity of games with single objectives is often in NP ∩ coNP, asking for multiple objectives often makes solving games harder. Parity objectives are commonly viewed as the simplest of these objectives, and some traditional solutions for non-stochastic parity games go through simple reductions, via mean-payoff or energy conditions (which are quite similar in non-stochastic games), to discounted payoff games, which establishes the membership of those problems in UP and coUP [35]. However, asking for *two* parity objectives to be satisfied at the same time leads to coNP completeness [21].

We study the almost-sure satisfaction of the energy-parity objective, i.e., satisfaction with probability 1. Such *qualitative analysis* is important as there are many applications where we need to know whether the correct behavior arises almost surely, e.g., in the analysis of randomized distributed algorithms (see, e.g., [43,47]) and safety-critical examples like the one above. Moreover, the algorithms for *quantitative analysis*, i.e., computing the optimal probability of satisfaction, typically start by performing the qualitative analysis first and then solving a

game with a simpler objective (see, e.g., [23,15]). Finally, there are stochastic models for which qualitative analysis is decidable but the quantitative one is not (e.g., probabilistic finite automata [6]). This may also be the case for our model.

**Our contributions.** We consider stochastic games with energy-parity winning conditions and show that deciding whether maximizer can win almost-surely for a given initial energy level k is in NP ∩ coNP. We show the same for checking if such k exists at all and checking if a given k is the smallest possible for which this holds. The proofs are considerably harder than the corresponding result for MDPs [40] (on which they are partly based), because the attainable mean-payoff value is no longer a valid criterion in the analysis (via combinations of sub-objectives). E.g., even though the stored energy might be inexorably drifting towards +∞ (resp. −∞), the mean-payoff value might still be zero because the minimizer (resp. maximizer) can delay payoffs for longer and longer (though not indefinitely, due to the parity condition). Moreover, the minimizer might be able to choose between different ways of losing and never commit to any particular way after any finite prefix of the play (see Example 1).

Our proof characterizes almost-sure energy-parity via a recursive combination of complex sub-objectives called *Gain* and *Bailout*, which can each eventually be solved in NP ∩ coNP.

Our proof of the coNP membership is based on a result on the strategy complexity of a natural class of objectives, which is of independent interest. We show (cf. Theorem 6; based on previous work in [34]) that, if an objective O is such that its complement is both shift-invariant and submixing, and that every MDP admits optimal finite-memory deterministic maximizer strategies for O, then the same is true in turn-based stochastic games.

*Example 1.* Fig. 1 shows an energy-parity game that the maximizer can win almost surely when starting with an energy level of ≥ 2 from the middle left node. Whenever the game is at that node with an energy level ≥ 3, the maximizer can turn left and has at least a ½ chance that the energy level will never drop to 2 while winning the game with priority 2. This is because we can

Fig. 1: An SSG with two maximizer states, one minimizer state, and one probabilistic state. Each state is annotated with its priority. Each edge is annotated with a reward by which the energy level is increased after traversing it (respectively, decreased if the reward is negative). The maximizer wins if the lowest priority visited infinitely often is even and the energy level never drops below 0.

view this process as a random walk on a half line. If x_n is the probability of reaching energy level 2 when starting at energy n, then these probabilities are the least point-wise positive solution to the following system of linear equations: x₂ = 1 and x_n = (2/3)x_{n+1} + (1/3)x_{n−1} for all n ≥ 3. We then get that x_n = (1/2)^{n−2}, so the probability of not reaching energy level 2 is ≥ ½ for all n ≥ 3. Always turning left guarantees that, almost surely, the parity condition holds and the limit inferior of the energy level is not −∞. We call this condition *Gain*. Strategies for *Gain* can be used when the energy level is sufficiently high (at least 3 in our example) to win with a positive probability.
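As a quick sanity check (not part of the paper), the following Python snippet verifies that x_n = (1/2)^(n−2) satisfies the recurrence above and estimates x_3 by simulating the walk; the cutoffs and trial counts are arbitrary choices made for this illustration.

```python
import random

def x(n):
    return 0.5 ** (n - 2)

# x_n = 2/3 x_{n+1} + 1/3 x_{n-1} for n >= 3
for n in range(3, 10):
    assert abs(x(n) - (2/3 * x(n + 1) + 1/3 * x(n - 1))) < 1e-12

def hits_level_two(start, steps=2_000):
    """One trajectory of the walk: +1 with probability 2/3, -1 with probability 1/3."""
    level = start
    for _ in range(steps):
        if level == 2:
            return True
        level += 1 if random.random() < 2/3 else -1
    return level == 2

trials = 5_000
estimate = sum(hits_level_two(3) for _ in range(trials)) / trials
print(f"estimated x_3 = {estimate:.3f} (exact value 1/2)")
```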

However, if the maximizer plays for *Gain* and always moves left, then for every initial energy level the chance of eventually dropping the energy down to level 2 is positive, due to the negative cycle. When that happens, the only other option for the maximizer is to move right. There, the minimizer can 'choose how to lose', via a disjunction of two conditions that we later formalize as *Bailout*. Either the minimizer goes back to the start state without changing the energy level (thus the maximizer wins, as the energy stays at level 2 and only the good priority 2 is seen), or the minimizer turns right. In the latter case, the play visits a dominating odd priority (which is bad for the maximizer) but also increases the energy by 1, which allows the maximizer to switch back to playing left for the *Gain* condition until energy level 2 is reached again.

Our maximizer strategies are a complex interplay between *Bailout* and *Gain*. In the example, it is easy to see that the probability of seeing priority 1 infinitely often is zero if the maximizer follows the strategy just described (the probability of having to go right more than n times is at most (½)^n), so the maximizer wins this energy-parity game almost surely. Note that the maximizer does not win almost surely when the initial energy level is 0 or 1.

**Previous work on combined objectives.** Non-stochastic energy-parity games have been studied in [16]. They can be solved in NP ∩ coNP and maximizer strategies require only finite (but exponential) memory, a property that also made it possible to show P-time inter-reducibility with mean-payoff parity games. More recently, they were also shown to be solvable in pseudo-quasi-polynomial time [26]. Related results on non-stochastic games (e.g., mean-payoff parity) are summarized in [18].

Most existing work on combined objectives for stochastic systems, for example [17,18,9,40], is restricted to Markov decision processes (MDPs; aka 1½-player games). Almost-sure energy-parity objectives for MDPs were first considered in [17,18], where a direct reduction to ordinary energy games was proposed. This reduction relies on the assumption that maximizer can win using finite memory if at all. Unfortunately, this assumption does not necessarily hold: it was shown in [40] that an almost-sure winning strategy for energy-parity in finite MDPs may require infinite memory. Nevertheless, it was possible to recover the original result, that deciding the existence of a.s. winning strategies is in NP ∩ coNP (and pseudo-polynomial time), by showing that the existence of an a.s. winning strategy can be witnessed by the existence of two compatible, and finite-memory, winning strategies for two simpler objectives. We generalize this approach from MDPs to full stochastic games.

Stochastic mean-payoff parity games were studied in [20], where it was shown that they can be solved in NP ∩ coNP. However, this does not imply a solution for stochastic energy-parity games, since, unlike in the non-stochastic case [16], there is no known reduction from energy-parity to mean-payoff parity in stochastic games. (The reduction in [16] relies on the fact that maximizer has a winning finite-memory strategy for energy-parity, which does not generally hold for stochastic games or MDPs; see above.)

A related model is that of 1-counter MDPs (and stochastic games), studied in [12,11,8], since the value of the counter can be interpreted as the stored energy. These papers consider the objective of reaching counter value zero (which is dual to the energy objective of staying above zero), thus the roles of minimizer and maximizer are swapped. However, unlike in this paper, these works do not combine termination objectives with extra parity conditions.

**Structure of the paper.** The rest of the paper is organized as follows. We start by introducing the notation and formal definitions of games and objectives in the next section. In Section 3 we show how checking almost-sure energy-parity objectives can be characterized in terms of two newly defined auxiliary objectives: Gain and Bailout. In Sections 4 and 5, we show that almost-sure Bailout and Gain objectives, respectively, can be checked in NP and coNP. Section 6 contains our main result: NP and coNP algorithms for checking almost-sure energy-parity games with a known and unknown initial energy, as well as checking if a given initial energy is the minimal one. We conclude and point out some open problems in Section 7. Due to page restrictions, most proofs in the main body of the paper were replaced by sketches. The detailed proofs can be found in the full version of this paper [41].

### **2 Preliminaries**

A probability distribution over a set X is a function f : X → [0, 1] such that Σ_{x∈X} f(x) = 1. We write D(X) for the set of distributions over X.

**Games, Strategies, Measures.** A *Simple Stochastic Game (SSG)* is a directed graph G def= (V, E, λ), where all states have an outgoing edge and the set of states is partitioned into states owned by *maximizer* (V_□), *minimizer* (V_♢) and probabilistic states (V_○). The set of *edges* is E ⊆ V × V and λ : V_○ → D(E) assigns each probabilistic state a probability distribution over its outgoing edges. W.l.o.g., we assume that each probabilistic state has at most two successors, because one can introduce a new probabilistic state for each excess successor. We let λ(ws) def= λ(s) for all ws ∈ (V E)*V_○.

A *path* is a finite or infinite sequence ρ def= s_0 e_0 s_1 e_1 ... such that e_i = (s_i, s_{i+1}) ∈ E holds for all indices i. A *run* is an infinite path and we write *Runs* def= (V E)^ω for the set of all runs.

A *strategy* for maximizer is a function σ : (V E)*V_□ → D(E) that assigns to each path ws ∈ (V E)*V_□ a probability distribution over the outgoing edges of its target node s. That is, σ(ws)(e) > 0 implies e = (s, t) ∈ E for some t ∈ V. A strategy is called *memoryless* if σ(xs) = σ(ys) for all x, y ∈ (V E)* and s ∈ V_□, *deterministic* if σ(w) is Dirac for all w ∈ (V E)*V_□, and *finite-state* if there exists an equivalence relation ∼ of finite index on (V E)*V_□ such that σ(ρ_1) = σ(ρ_2) whenever ρ_1 ∼ ρ_2. Of particular interest to us will be the class of *memoryless deterministic strategies* (*MD*) and the class of *finite-memory deterministic strategies* (*FD*). Strategies for minimizer are defined analogously and will usually be denoted by τ : (V E)*V_♢ → D(E).

A maximizing (minimizing) *Markov Decision Process (MDP)* is a game in which minimizer (maximizer) has no choices, i.e., all her states have exactly one successor. We will write G[τ ] for the MDP resulting from fixing the strategy τ . A *Markov chain* is a game where neither player has a choice. In particular, G[σ, τ ] is a Markov chain obtained by setting, in the game G, the strategies for maximizer and minimizer to σ and τ , respectively.

Given an initial state s ∈ V and strategies σ and τ for maximizer and minimizer, respectively, the set of runs starting in s naturally extends to a probability space as follows. We write *Runs*^G_w for the w*-cylinder*, i.e., the set of all runs with prefix w ∈ (V E)*V. We let F^G be the σ-algebra generated by all these cylinders. We inductively define a probability function P^{G,σ,τ}_s on all cylinders, which then uniquely extends to F^G by Carathéodory's extension theorem [5], by setting P^{G,σ,τ}_s(*Runs*^G_s) def= 1 and P^{G,σ,τ}_s(*Runs*^G_w) def= ∏_{i=0}^{n−1} dist_i(s_0 e_0 s_1 e_1 ... s_i)(e_i) for w = s_0 e_0 s_1 e_1 ... e_{n−1} s_n, where s_0 = s, e_i = (s_i, s_{i+1}) and dist_i is σ(·), τ(·) or λ(·), for s_i ∈ V_□, V_♢ or V_○, respectively.
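As an illustration of how these cylinder probabilities are multiplied out, here is a small Python sketch (ours; the three-state game, its edge probabilities and the memoryless deterministic strategies below are hypothetical example data) that computes the probability of a finite prefix.

```python
# Toy SSG: states owned by 'max', 'min' or 'rand'; lam gives edge probabilities
# for random states; sigma/tau are memoryless deterministic strategies, i.e.
# they pick one outgoing edge per controlled state.
owner = {'s': 'max', 't': 'rand', 'u': 'min'}
lam   = {'t': {('t', 's'): 0.5, ('t', 'u'): 0.5}}   # distribution over edges
sigma = {'s': ('s', 't')}                            # maximizer's choice
tau   = {'u': ('u', 's')}                            # minimizer's choice

def prefix_probability(prefix):
    """Probability of the cylinder of all runs extending `prefix`,
    where prefix = [s0, e0, s1, e1, ..., sn] alternates states and edges."""
    p = 1.0
    for i in range(0, len(prefix) - 1, 2):
        state, edge = prefix[i], prefix[i + 1]
        if owner[state] == 'max':
            p *= 1.0 if sigma[state] == edge else 0.0
        elif owner[state] == 'min':
            p *= 1.0 if tau[state] == edge else 0.0
        else:  # random state: use lam
            p *= lam[state].get(edge, 0.0)
    return p

# Probability of the cylinder s -> t -> u -> s.
print(prefix_probability(['s', ('s', 't'), 't', ('t', 'u'), 'u', ('u', 's'), 's']))
```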

**Objective Functions.** A (Borel) *objective* is a set Obj ∈ F^G of runs. We write $\overline{\mathsf{Obj}} \stackrel{\text{def}}{=} \mathit{Runs} \setminus \mathsf{Obj}$ for its complement. Borel objectives Obj are weakly determined [39,38], which means that

$$\sup\_{\sigma} \inf\_{\tau} \mathbb{P}\_s^{\sigma,\tau}(\mathbf{Obj}) = \inf\_{\tau} \sup\_{\sigma} \mathbb{P}\_s^{\sigma,\tau}(\mathbf{Obj}).$$

This quantity is called the *value* of Obj in state s, and written as Val^G_s(Obj). We say that Obj holds *almost-surely* (abbreviated as *a.s.*) at state s iff there exists σ such that ∀τ. P^{G,σ,τ}_s(Obj) = 1. Let AS^G(Obj) denote the set of states at which Obj holds almost surely. We will drop the superscript G and simply write *Runs*, P^{σ,τ}_s and AS (Obj), if the game is clear from the context.

We use the syntax and semantics of operators F (eventually) and G (always) from the temporal logic LTL [25] to specify some conditions on runs.

A *reachability condition* is defined by a set of target states T ⊆ V. A run ρ = s_0 e_0 s_1 ... satisfies the reachability condition iff there exists an i ∈ ℕ s.t. s_i ∈ T. We write FT ⊆ *Runs* for the set of runs that satisfy this reachability condition. Given a set of states W ⊆ V, we lift this to a safety condition on runs and write GW ⊆ *Runs* for the set of runs ρ = s_0 e_0 s_1 ... where ∀i. s_i ∈ W.

A *parity condition* is given by a bounded function *parity* : V → ℕ that assigns a priority (a non-negative integer) to each state. A run ρ ∈ *Runs* satisfies the parity condition iff the minimal priority that appears infinitely often on the run is even. The *parity objective* is the subset PAR ⊆ *Runs* of runs that satisfy the parity condition.

*Energy conditions* are given by a function *r* : E → ℤ that assigns a *reward* value to each edge. For a given initial energy value k ∈ ℕ, a run s_0 e_0 s_1 e_1 ... satisfies the k-energy condition if, for every finite prefix of length n, the *energy level* k + Σ_{i=0}^{n} r(e_i) is greater or equal to 0. Let EN(k) ⊆ *Runs* denote the k-energy objective, consisting of those runs that satisfy the k-energy condition.

The l*-storage condition* holds for a run s_0 e_0 s_1 e_1 ... if l + Σ_{i=m}^{n−1} r(s_i, s_{i+1}) ≥ 0 holds for every infix s_m e_m s_{m+1} ... s_n. Let ST(k,l) ⊆ *Runs* denote the k-energy l-storage objective, consisting of those runs that satisfy both the k-energy and the l-storage condition. We write ST(k) for ∪_l ST(k,l). Clearly, ST(k) ⊆ EN(k).

*Mean-payoff* and *limit-payoff conditions* are defined w.r.t. the same reward function as the energy conditions. The *mean-payoff* value of a run ρ = s_0 e_0 s_1 e_1 ... is MP(ρ) def= lim inf_{n→∞} (1/n) Σ_{i=0}^{n−1} r(e_i). For ⊲ ∈ {>, ≥, =, ≤, <} and c ∈ ℝ ∪ {−∞, ∞}, the set MP(⊲c) ⊆ *Runs* consists of all runs ρ with MP(ρ) ⊲ c. Let LimInf(⊲c) ⊆ *Runs* contain all runs ρ with (lim inf_{n→∞} Σ_{i=0}^{n} r(e_i)) ⊲ c, and likewise for LimSup(⊲c).
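The run conditions above are straightforward to evaluate on a finite prefix. The following Python sketch (ours, for illustration; the reward sequence is hypothetical) checks the k-energy and l-storage conditions and computes the average reward of a finite prefix, whose limit inferior over growing prefixes is the mean payoff.

```python
def satisfies_energy(rewards, k):
    """k-energy: k plus every partial sum of rewards stays >= 0."""
    level = k
    for r in rewards:
        level += r
        if level < 0:
            return False
    return True

def satisfies_storage(rewards, l):
    """l-storage: l plus the total reward of every infix stays >= 0."""
    n = len(rewards)
    for m in range(n):
        acc = 0
        for i in range(m, n):
            acc += rewards[i]
            if l + acc < 0:
                return False
    return True

def average_reward(rewards):
    """Average reward of the (finite) prefix; on infinite runs the paper
    takes the limit inferior of these averages."""
    return sum(rewards) / len(rewards)

rewards = [-1, +2, -1, -1, +1]          # hypothetical edge rewards
print(satisfies_energy(rewards, 1))     # True:  1-energy holds
print(satisfies_energy(rewards, 0))     # False: the first step drops below 0
print(satisfies_storage(rewards, 2))    # True:  no infix drops by more than 2
print(average_reward(rewards))          # 0.0
```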

The combined energy-parity objective EN(k) ∩ PAR is Borel and therefore weakly determined, meaning that it has a well-defined (inf sup = sup inf) value for every game [39,38]. Moreover, the almost-sure energy-parity objective (asking to win with probability 1) is even strongly determined [37]: either maximizer has a strategy to enforce the condition with probability 1 or minimizer has a strategy to prevent this.

#### **3 Characterizing Energy-Parity via Gain and Bailout**

The main theorem of this section (Theorem 5) characterizes almost-sure energy-parity objectives in terms of two intermediate objectives called Gain and k-Bailout for parameters k ≥ 0. This will form the basis of all computability results: we will show (as Theorems 14, 17 and 18) how to compute almost-sure sets for these intermediate objectives.

**Definition 2.** *Consider a finite SSG* G = (V,E,λ)*, as well as reward and parity functions defining the objectives* PAR, LimInf(> −∞), LimSup(= ∞) *as well as* ST(k,l) *and* EN(k) *for every* k, l ∈ ℕ*. We define combined objectives* Gain *and* k*-*Bailout def= ∪_l Bailout(k,l) *where*

$$\begin{array}{rcl} \mathsf{Gain} & \stackrel{\text{def}}{=} & \mathsf{Limlnf}(>-\infty) \cap \mathsf{PAR} \\ \mathsf{Bailout}(k,l) & \stackrel{\text{def}}{=} & (\mathsf{ST}(k,l) \cap \mathsf{PAR}) \cup (\mathsf{EN}(k) \cap \mathsf{LimSup}(=\infty)) .\end{array}$$

The main idea behind these two objectives is a special witness property for energy-parity. We argue that, if maximizer has an almost-sure winning strategy for energy-parity then he also has one that combines two almost-sure winning strategies, one for Gain and one for k-Bailout.

Notice that playing an almost-sure winning strategy for Gain implies a uniformly lower-bounded strictly positive chance that the energy level never drops below zero (assuming it is sufficiently high to begin with). This fact uses the finiteness of the set of control-states and does not hold for infinite-state MDPs. In the unlikely event that the energy level does get close to zero, maximizer switches to playing an almost sure winning strategy for k-Bailout. This is a disjunction of two scenarios, and the balance might be influenced by minimizer's choices. In the first scenario (ST(k,l) ∩ PAR) the energy never drops much and stays above zero (thus satisfying energy-parity). In the second scenario, (EN(k) ∩ LimSup(= ∞)), the parity objective is temporarily suspended in favor of boosting (while always staying above zero) the energy to a sufficiently high level to switch back to the strategy for Gain and thus try again from the beginning. The probability of infinitely often switching between these modes is zero due to the lower-bounded chance of success in the Gain phase. Therefore, maximizer eventually wins by playing for Gain. Note that maximizer needs to remember the current energy level in order to know when to switch and consequently, this strategy uses infinite memory.

*Example 3.* Consider again the game in Fig. 1. The middle left state satisfies both the Gain and the k-Bailout objective for all k ≥ 2 almost-surely. The respective winning strategies are to always go left for Gain or always go right for k-Bailout when at that state. Note that it satisfies neither the 0-Bailout nor the 1-Bailout objective almost-surely.

We define the subset W ⊆ V of states from which maximizer can almost surely win both Gain and k-Bailout (assuming sufficiently high initial energy), while at the same time ensuring that the play remains within this set of states. These are the states from which maximizer can win by freely combining individual strategies for the Gain and Bailout objectives.

**Definition 4.** *Given a finite SSG* G = (V,E,λ)*, let* W ⊆ V *be the largest subset of states satisfying the following condition*

$$W \subseteq \mathsf{AS} \left( \mathsf{Gain} \cap \mathbb{G}W \right) \cap \bigcup\_{k} \mathsf{AS} \left( k\text{-Bailout} \cap \mathbb{G}W \right),$$

This condition describes a fixed point, and it is easy to see that if two sets W_1 and W_2 are such fixed points, then so is W_1 ∪ W_2. Thus, the maximal fixed point W is well-defined.

Our main characterization of almost-sure energy-parity objectives is the following Theorem 5. It states that maximizer can almost surely win an EN(k) ∩ PAR objective if, and only if, he can win the easier k-Bailout objective while always staying in the safe set W.

**Theorem 5.** *For every* k ∈ ℕ*,* AS (EN(k) ∩ PAR) = AS (k*-*Bailout ∩ GW)*.*

Our proof of this characterization theorem relies on the following claim, which allows us to lift the existence of finite-memory deterministic optimal strategies from MDPs to SSGs. It applies to a fairly general class of objectives and, we believe, is of independent interest.

Recall that $\overline{\mathsf{Obj}} \stackrel{\text{def}}{=} \mathit{Runs} \setminus \mathsf{Obj}$ denotes the complement of objective Obj. For runs a, b, c ∈ *Runs* we say that a is a *shuffle* of b and c if there exist factorizations b = b_0 b_1 ... and c = c_0 c_1 ... such that a = b_0 c_0 b_1 c_1 .... An objective Obj is called *submixing* if, for every run a ∈ Obj that is a shuffle of runs b and c, either b ∈ Obj or c ∈ Obj. Obj is *shift-invariant* if, for every run s_1 e_1 s_2 e_2 ..., it holds that s_1 e_1 s_2 e_2 ... ∈ Obj ⇐⇒ s_2 e_2 ... ∈ Obj. Shift-invariance slightly generalizes the better-known *tail* condition (see [34] for a discussion).

**Theorem 6.** *Let* O *be an objective such that* $\overline{O}$ *is both shift-invariant and submixing. If maximizer has optimal FD strategies (from any state* s*) for* O *for every finite MDP, then maximizer has optimal FD strategies (from any state* s*) for* O *for every finite SSG.*

This applies in particular to the Gain objective, but not to k-Bailout objectives, as these are not shift-invariant. A proof of Theorem 6 can be found in [41]. It uses a recursive argument based on the notion of *reset strategies* from [34].

The remainder of this section is dedicated to proving Theorem 5. We will first collect the remaining technical claims about Gain, Bailout, and reachability objectives. Most notably, as Lemma 8, we show that if maximizer can almost surely win Gain in a SSG, then he can do so using a FD strategy which moreover satisfies an energy-parity objective with strictly positive (and lower-bounded) probability. This is shown in part based on Theorem 6 applied to the Gain objective. We will also need the following fact about reachability objectives in finite MDPs.

**Lemma 7 ([8, Lemma 3.9]).** *Let* M *be a finite MDP and* Reach_T *be the reachability objective with target* T def= {s | Val_s(LimInf(= −∞)) = 1}*. One can compute a rational constant* c < 1 *and an integer* h ≥ 0 *such that for all states* s *and* i ≥ h *we have* ∀τ. P^τ_s($\overline{\mathsf{EN}(i)}$ ∩ $\overline{\mathit{Reach}_T}$) ≤ c^i/(1−c)*.*

**Lemma 8.** *Consider a finite SSG* G = (V,E,λ) *where* Gain *holds a.s. for every state* s ∈ V*. Then, for every* δ ∈ [0, 1) *and* s ∈ V*, there exists a* k̂ ∈ ℕ *and an FD strategy* σ̂ *s.t.*

*1.* ∀τ. P^{σ̂,τ}_s(Gain) = 1*, and*
*2.* ∀τ. P^{σ̂,τ}_s(EN(k̂) ∩ PAR) ≥ δ*.*

*Proof.* Fix a δ ∈ [0, 1) and a state s ∈ V. Both LimInf(= −∞) and parity objectives are *shift-invariant* and *submixing*, and therefore their union also has both these properties. It follows that $\overline{\mathsf{Gain}} = \overline{\mathsf{LimInf}(>-\infty) \cap \mathsf{PAR}} = \mathsf{LimInf}(=-\infty) \cup \overline{\mathsf{PAR}}$ is both shift-invariant and submixing, since the complement of a parity objective is also a parity objective. By Lemma 16 and Theorem 6, there

exists an almost-sure winning FD strategy σ̂ for maximizer for the objective Gain from s, i.e., ∀τ. P^{σ̂,τ}_s(Gain) = 1, thus yielding Item 1.

Let M be the MDP obtained from G by fixing the strategy σ̂ for maximizer from s. Since G is finite and σ̂ is FD, M is also finite. In M we have ∀τ. P^τ_s(Gain) = 1. In particular, in M, the set T def= {s | Val_s(LimInf(= −∞)) = 1} is not reachable, i.e., ∀τ. P^τ_s(Reach_T) = 0.

By Lemma 7, in M there exists a horizon h ∈ ℕ and a constant c < 1 such that for all i ≥ h we have ∀τ. P^τ_s($\overline{\mathsf{EN}(i)}$ ∩ $\overline{\mathit{Reach}_T}$) ≤ c^i/(1−c). Since T cannot be reached in M, the condition $\overline{\mathit{Reach}_T}$ evaluates to *true* and we have ∀τ. P^τ_s(EN(i)) ≥ 1 − c^i/(1−c). Since c < 1 and δ < 1, we can pick a sufficiently large k̂ ≥ h such that 1 − c^{k̂}/(1−c) ≥ δ and obtain ∀τ. P^τ_s(EN(k̂)) ≥ δ in M. Moreover, the above property ∀τ. P^τ_s(Gain) = 1 in particular implies ∀τ. P^τ_s(PAR) = 1. Thus we obtain ∀τ. P^τ_s(EN(k̂) ∩ PAR) ≥ δ in M.

Back in the SSG G, we have ∀τ. P^{σ̂,τ}_s(EN(k̂) ∩ PAR) ≥ δ as required for Item 2.

**Lemma 9.** EN(k) ∩ PAR ⊆ k*-*Bailout*.*

*Proof.* Let ρ be a run in EN(k) ∩ PAR. There are two cases. In the first case we have ρ ∈ ∪_l ST(k,l) ∩ PAR and thus directly ρ ∈ k-Bailout. Otherwise, ρ ∉ ∪_l ST(k,l) ∩ PAR. Since ρ ∈ PAR, we must have ρ ∉ ∪_l ST(k,l). Since ρ ∈ EN(k), it follows that ρ does not satisfy the l-storage condition for any l ∈ ℕ. So, for every l ∈ ℕ, there exists an infix ρ′′ of ρ s.t. l + r(ρ′′) < 0. Let ρ′ be the prefix of ρ before ρ′′. Since ρ ∈ EN(k) we have k + r(ρ′ρ′′) ≥ 0 and thus r(ρ′) ≥ −k − r(ρ′′) > −k + l. To summarize, if ρ ∉ ∪_l ST(k,l) ∩ PAR then, for every l, it has a prefix ρ′ with r(ρ′) > −k + l. Thus ρ ∈ LimSup(= ∞). Thus ρ ∈ k-Bailout.

We now define W′ as the set of states that are almost-sure winning for energy-parity with some sufficiently high initial energy level. (W′ is also called the winning set for the unknown initial credit problem.)

**Definition 10.** W′ def= ∪_k AS (EN(k) ∩ PAR)*.*

#### **Lemma 11.**

*1.* AS (EN(k) ∩ PAR) ⊆ AS (Gain ∩ GW′)

*2.* AS (EN(k) ∩ PAR) ⊆ AS (k*-*Bailout ∩ GW′)

*Proof.* Let s ∈ AS (EN(k) ∩ PAR) and σ a strategy that witnesses this property. Except for a null-set, all runs ρ = s e_0 s_1 e_1 ... e_{n−1} s_n ... from s induced by σ satisfy EN(k) ∩ PAR.

Let ρ′ = s e_0 s_1 e_1 ... s_m be a finite prefix of ρ. For every n ≥ 0 we have k + Σ_{i=0}^{n−1} r(e_i) ≥ 0, since ρ ∈ EN(k). In particular this holds for all n ≥ m. So, for every n ≥ m, we have k + Σ_{i=0}^{m−1} r(e_i) + Σ_{i=m}^{n−1} r(e_i) ≥ 0. Therefore s_m ∈ AS (EN(k′) ∩ PAR), where k′ = k + Σ_{i=0}^{m−1} r(e_i), as witnessed by playing σ with history s e_0 s_1 e_1 ... s_m from s_m. Thus s_m ∈ ∪_k AS (EN(k) ∩ PAR) = W′, i.e., almost all σ-induced runs ρ satisfy GW′.

Towards Item 1, we have EN(k) ⊆ LimInf(> −∞) and thus EN(k) ∩ PAR ⊆ LimInf(> −∞) ∩ PAR = Gain. Therefore σ witnesses s ∈ AS (Gain ∩ GW′).

Towards Item 2, we have EN(k) ∩ PAR ⊆ k-Bailout by Lemma 9. Thus σ witnesses s ∈ AS (k-Bailout ∩ GW′).

**Lemma 12.** W′ ⊆ W*.*

*Proof.* It suffices to show that W′ satisfies the monotone condition imposed on W (cf. Definition 4), since W is defined as the largest set satisfying this condition.

Let s ∈ W′ = ∪_k AS (EN(k) ∩ PAR). Then s ∈ AS (EN(k̂) ∩ PAR) for some fixed k̂. By Lemma 11(1) we have s ∈ AS (Gain ∩ GW′). By Lemma 11(2) we have s ∈ AS (k̂-Bailout ∩ GW′) ⊆ ∪_k AS (k-Bailout ∩ GW′).

*Proof of Theorem 5.* Towards the ⊆ inclusion, we have

$$\mathsf{AS}\left(\mathsf{EN}(k)\cap\mathsf{PAR}\right)\subseteq\mathsf{AS}\left(k\text{-}\mathsf{Bailout}\cap\mathsf{G}W'\right)\subseteq\mathsf{AS}\left(k\text{-}\mathsf{Bailout}\cap\mathsf{G}W\right)$$

by Lemma 11(2) and Lemma 12.

Towards the ⊇ inclusion, let s ∈ AS (k-Bailout ∩ GW) and σ_1 be a strategy that witnesses this. We show that s ∈ AS (EN(k) ∩ PAR). We now consider the modified SSG G′, obtained from G by restricting the state set to W. In particular, s ∈ W and σ_1 witnesses s ∈ AS (k-Bailout) in G′. We now construct a strategy σ that witnesses s ∈ AS (EN(k) ∩ PAR) in G′, and thus also in G. The strategy σ will use infinite memory to keep track of the current energy level of the run.

Apart from σ1, we require several more strategies as building blocks for the construction of σ.

First, in G we had ∀s′ ∈ W. s′ ∈ AS (Gain ∩ GW), and thus in G′ we have ∀s′ ∈ W. s′ ∈ AS (Gain). For every s′ ∈ W we instantiate Lemma 8 for G′ with δ = 1/2 and obtain a number k̂_{s′} and a strategy σ̂_{s′} with

1. ∀τ. P^{σ̂_{s′},τ}_{s′}(Gain) = 1, and
2. ∀τ. P^{σ̂_{s′},τ}_{s′}(EN(k̂_{s′}) ∩ PAR) ≥ 1/2.

Let k_1 def= max{k̂_{s′} | s′ ∈ W}. The strategies σ̂_{s′} are called *gain strategies*.

Second, by the finiteness of V, there is a minimal number k_2 such that ∪_k AS (k-Bailout ∩ GW) = ∪_{k≤k_2} AS (k-Bailout ∩ GW) in G. Therefore, in G′ we have that

$$W \subseteq \bigcup\_k \mathsf{AS}\left(k\text{-Bailout}\right) = \bigcup\_{k \le k\_2} \mathsf{AS}\left(k\text{-Bailout}\right) = \mathsf{AS}\left(k\_2\text{-Bailout}\right).$$

Thus in G′ for every s′ ∈ W there exists a strategy σ̃_{s′} with ∀τ. P^{σ̃_{s′},τ}_{s′}(k_2-Bailout) = 1. The strategies σ̃_{s′} are called *bailout strategies*. Let k′ def= k_1 + k_2 − k + 1. We now define the strategy σ.

**Start:** First σ plays like σ_1 from s. Since σ_1 witnesses s ∈ AS (k-Bailout) against every minimizer strategy τ, almost all induced runs ρ = s e_0 s_1 e_1 ... satisfy either

	- **(A)** (∪_l ST(k, l) ∩ PAR), or
	- **(B)** (EN(k) ∩ LimSup(= ∞)).

Almost all runs ρ of the latter type (B) (and potentially also some runs of type (A)) satisfy EN(k) and Σ_{i=0}^{l} r(e_i) ≥ k′ eventually for some l. If we observe Σ_{i=0}^{l} r(e_i) ≥ k′ for some prefix s e_0 s_1 e_1 ... e_l s′ of the run ρ, then our strategy σ plays from s′ as described in the **Gain** part below. Otherwise, if we never observe this condition, then our run ρ is of type (A) and σ continues playing like σ_1. Since property (A) implies (EN(k) ∩ PAR), this is sufficient.

**Gain:** In this case we are in the situation where we have reached some state s′ after some finite prefix ρ′ of the run, where r(ρ′) ≥ k′. Our strategy σ now plays like the gain strategy σ̂_{s′}, as long as r(ρ′) ≥ k′ − k_1 holds for the current prefix ρ′ of the run. By Item 2, this will satisfy ∀τ. P^{σ̂_{s′},τ}_{s′}(EN(k̂_{s′}) ∩ PAR) ≥ 1/2 and thus ∀τ. P^{σ̂_{s′},τ}_{s′}(EN(k_1) ∩ PAR) ≥ 1/2. It follows that with probability ≥ 1/2 we will keep playing σ̂_{s′} forever and satisfy PAR and always r(ρ′) ≥ k′ − k_1 and thus EN(k), since k + r(ρ′) ≥ k + k′ − k_1 = k_2 + 1 ≥ 0.

Otherwise, if eventually r(ρ′) = k′ − k_1 − 1, then we have k + r(ρ′) = k_2. In this case (which happens with probability < 1/2) we continue playing as described in the **Bailout** part below.

**Bailout:** In this case we have reached some state s′′ after a prefix ρ′ of the run with k + r(ρ′) = k_2. Our strategy σ now plays like the bailout strategy σ̃_{s′′}. Since σ̃_{s′′} witnesses s′′ ∈ AS (k_2-Bailout), almost all induced runs satisfy either

	- **(A)** (∪_l ST(k_2, l) ∩ PAR), or
	- **(B)** (EN(k_2) ∩ LimSup(= ∞)).

As long as r(ρ′) < k′ holds for the current prefix ρ′ of the run, we keep playing σ̃_{s′′}. Otherwise, if eventually r(ρ′) ≥ k′ holds, then we switch back to playing the **Gain** strategy above. All the runs that never switch back to playing the **Gain** strategy must be of type (A) and thus satisfy PAR. Since we have k_2-Bailout ⊆ EN(k_2), it follows that, for every prefix ρ′′ of the run from s′′ according to σ̃_{s′′}, we have k_2 + r(ρ′′) ≥ 0. Thus, writing ρ′ for the prefix after which Bailout was entered, every prefix ρ′ρ′′ of ρ satisfies k + r(ρ′ρ′′) = k + r(ρ′) + r(ρ′′) = k_2 + r(ρ′′) ≥ 0. Therefore, the EN(k) objective is satisfied by all runs.

As shown above, almost all runs induced by σ that eventually stop switching between the three modes satisfy EN(k) ∩ PAR. Switching from Gain/Bailout to Start is impossible, but switching from Gain to Bailout and back is possible. However, the set of runs that infinitely often switch between Gain and Bailout is a null-set, because the probability of switching from Gain to Bailout is ≤ 1/2. Thus, σ witnesses s ∈ AS (EN(k) ∩ PAR).

*Remark 13.* It follows from the results above that W′ = W. The ⊆ inclusion holds by Lemma 12. For the reverse inclusion we have

$$\begin{aligned} W &\subseteq \bigcup\_k \mathsf{AS}\left(k\text{-Bailout} \cap \mathbb{G}W\right) & \text{by Definition 4} \\ &= \bigcup\_k \mathsf{AS}\left(\mathsf{EN}(k) \cap \mathsf{PAR}\right) & \text{by Theorem 5} \\ &= W' & \text{by Definition 10.} \end{aligned}$$

### **4 Bailout**

In this section we will argue that it is possible to decide, in NP and coNP, whether the bailout objective can be satisfied almost surely. More precisely, we show the existence of procedures to decide if, for a given k ∈ ℕ and state s, there exists an l ∈ ℕ such that s almost-surely satisfies the Bailout(k,l) objective

Bailout(k,l) def = (ST(k,l) <sup>∩</sup> PAR) <sup>∪</sup> (EN(k) <sup>∩</sup> LimSup(= <sup>∞</sup>)).

Recall that the idea behind the Bailout objective is that, during a game for energy-parity, maximizer is temporarily abandoning the parity (but not the energy) condition in order to increase the energy to a sufficient level (which will then allow him to try an a.s. strategy for Gain once more). However, in a stochastic game – as opposed to an MDP [40] – an opponent could possibly prevent this increase in energy level at the expense of satisfying the original energy-parity objective in the first place (cf. Example 1). The Bailout objective is designed to capture the disjunction of both outcomes, as both are favorable for the maximizer. The parameter k is the acceptable total energy drop (i.e., the initial value), and the parameter l is the acceptable energy drop on any infix of a play, which translates to the upper bound on the energy level in the second outcome.

The question can be phrased equivalently as membership of a control state s in the almost-sure set for the k-Bailout objective for a given game G and energy level <sup>k</sup> <sup>∈</sup> <sup>N</sup>.

**Theorem 14.** *One can check in* NP, coNP *and pseudo-polynomial time if, for a given SSG* G def= (V,E,λ)*,* k ∈ ℕ *and control state* s ∈ V*, maximizer can almost-surely satisfy* k*-*Bailout *from* s*.*

*Moreover, there are* K, L ∈ ℕ*, polynomial in* |V| *and the largest absolute transition reward, so that* ∪_{k≥0} AS^G (k*-*Bailout) = AS^G (Bailout(K, L))*. And so, checking whether state* s *belongs to* ∪_{k≥0} AS^G (k*-*Bailout) *is in* NP *and* coNP*.*

*Proof (sketch).* This is shown by a sequence of transformations of the game and is ultimately reduced to finding the winner of a non-stochastic game with an energy-parity objective, which is known to be solvable in NP, coNP and pseudo-polynomial time [19]. One important observation is that it is possible to replace, without changing the outcome, the energy EN(k) condition in the Bailout(k,l) objective by the more restrictive energy-storage ST(k,l) condition. See [41] for further details.

### **5 Gain**

In this section we will argue that it is possible to decide, in NP and coNP, whether the Gain objective (i.e., LimInf(> −∞) ∩ PAR) can be satisfied almost surely.

We start by investigating the strategy complexity of winning strategies for the Gain objective.

**Lemma 15.** *In every finite SSG, minimizer has optimal MD strategies for objective* Gain*.*

*Proof.* We show that maximizer has MD optimal strategies for objectives of the form LimInf(= −∞) ∪ PAR. This is equivalent to the claim of the lemma because $\overline{\mathsf{LimInf}(>-\infty) \cap \mathsf{PAR}} = \mathsf{LimInf}(=-\infty) \cup \overline{\mathsf{PAR}}$ and the complement of a parity condition is itself a parity condition (with all priorities incremented by one).

We note that both LimInf(= −∞) and parity objectives PAR are shift-invariant and submixing, and therefore the union LimInf(= −∞) ∪ PAR also has both these properties. The claim now follows from the fact that SSGs with objectives that are both submixing and shift-invariant admit MD optimal strategies for maximizer [34, Theorem 5.2].

Based on the results in [40] one can show a similar claim for maximizer strategies in MDPs.

**Lemma 16.** *For finite MDPs, almost-sure winning maximizer strategies for* Gain *can be chosen FD.*

Using the existence of MD optimal minimizer strategies (Lemma 15) and a coNP upper bound for checking almost sure Gain in MDPs established in [40], we can derive a coNP procedure. See [41] for full details.

**Theorem 17.** *Checking whether a state* s ∈ V *of a SSG satisfies* Gain *almostsurely is in* coNP*.*

The rest of this section will deal with the NP upper bound, which is the most challenging part of this paper. The crux of our proof is the observation that if maximizer has a strategy that wins almost surely against all MD minimizer strategies, then he wins almost surely. This is because one of these MD strategies is optimal due to Lemma 15. We show that, in order to witness such an almost-sure winning strategy for maximizer in SSG G, it suffices to provide a polynomially larger SSG G3, together with an almost-sure winning strategy for the *storage-parity* objective (see Theorem 21 in Section 6) in G3. This will give us an NP algorithm, because G3, along with its winning strategy, can be guessed and verified in polynomial time. Formally we claim that:

**Theorem 18.** *Checking whether a state* s ∈ V *of* G *satisfies* Gain *almost-surely is in* NP*.*

*Proof.* (sketch) For technical convenience, we will assume w.l.o.g. that every SSG henceforth is in a normal form, where every random state has only one predecessor, which is owned by the maximizer. To show the existence of G3, we are going to introduce two intermediate games: G<sup>1</sup> and G2. These games are never constructed by our NP algorithm, but are just defined to break down the complex construction of G<sup>3</sup> into more manageable steps.

Intuitively, G1 is just G where all rewards on edges are multiplied by a large enough factor, f, to turn strategies with a mean-payoff > 0 into ones with mean-payoff > 2. G2 is an extension of G1 where the maximizer is given a choice before every visit to a probabilistic node. He can either let the game proceed as before, or sacrifice part of his one-step reward in exchange for a more evenly balanced reward outcome, so the energy can no longer drop arbitrarily low when a probabilistic cycle is reached. As a result, in G2 it suffices to consider a storage-parity objective (see Theorem 21 in Section 6) instead of Gain. The number of choices maximizer is given is the number of MD minimizer strategies, which clearly can be exponential. That would not suffice for an NP algorithm. Therefore, we show that most of these choices are redundant and can be removed without impairing the almost-sure winning region. As a result of this pruning, we obtain G3 of polynomial size.

For the technical details of the G → G1 → G2 → G3 constructions, please see [41]. Figure 2 shows what these transformations may look like.

#### **6 The Main Results**

In this section, we prove the main results of the paper, namely that almost-sure energy parity stochastic games can be decided in NP and coNP. The proofs are straightforward and follow from the much more involved characterization of almost sure energy parity objective in terms of the Bailout and Gain objectives established in Section 3 and their computational complexity analysis in Sections 4 and 5, respectively.

**Theorem 19.** *Given an SSG, energy level* k∗*, checking if a state* s *is almost-sure winning for* EN(k∗) ∩ PAR *is in* NP ∩ coNP*.*

*Proof.* Recall that we can compute the set W from Definition 4 by iterating

$$W\_i \quad \stackrel{\text{def}}{=} \mathsf{AS} \left( \mathsf{Gain} \cap \mathsf{G}W\_{i-1} \right) \cap \bigcup\_k \mathsf{AS} \left( k\text{-Bailout} \cap \mathsf{G}W\_{i-1} \right),$$

starting with W_0 def= V, until we reach the greatest fixed point W. Note that at step i we need to solve almost-sure Gain and almost-sure ∪_k AS (k-Bailout), where the states of the game are restricted to W_{i−1}. There can be at most |V| steps, because at least one state is removed in each iteration.
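This fixed-point computation can be phrased as the following Python sketch (ours); the two oracle functions `as_gain` and `as_bailout_union` stand for the NP/coNP procedures of Theorems 17/18 and Theorem 14 and are only stubbed here.

```python
def greatest_fixed_point(states, as_gain, as_bailout_union):
    """Iterate W_i = AS(Gain on W_{i-1}) intersected with AS(union of
    k-Bailout on W_{i-1}), starting from W_0 = V, until stabilisation.
    The oracles take the current state restriction and return the
    respective almost-sure sets within it."""
    w = set(states)
    while True:
        w_next = as_gain(w) & as_bailout_union(w)
        if w_next == w:
            return w
        w = w_next

# Example with stub oracles that are already stable on the result:
V = {'s', 't', 'u'}
print(greatest_fixed_point(V, as_gain=lambda w: w & {'s', 't'},
                              as_bailout_union=lambda w: w & {'t', 'u'}))
# -> {'t'}
```

Since every non-final iteration removes at least one state, the loop performs at most |V| iterations, matching the bound stated above.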

Fig. 2: An example game G (left) and the derived games. The strategy that always loops in the right-most state of G ensures a mean-payoff of 3. As this is the only MD strategy for maximizer that ensures a positive mean-payoff, a factor f = 1 is sufficient here and we have G<sup>1</sup> = G. In the derived game G<sup>2</sup> in Fig. 2b there are as many trade-in options for the random state as there are MD minimizer's strategies in G<sup>1</sup> (just two in this example). The blue one (top left) corresponds to minimizer going left and the red one (top right) to going up in G1. Maximizer almost-surely wins Gain in G iff he almost-surely wins a storage-parity condition (see Theorem 21) in G3.

It then suffices to check AS (k-Bailout ∩ GW) (i.e., AS (k-Bailout) for the subgame that consists only of the states of the fixed point W) for k = k∗. Note that this step can be skipped if k∗ ≥ K, the bound from Theorem 14.

Before we discuss how to use NP and coNP procedures to construct these sets and to conduct the final test on the fixed point W, we note that the '∩ GW_{i−1}' does not add anything substantial, as these are simply the same tests and procedures conducted on the subgame that only consists of the states of W_{i−1}.

To obtain an NP procedure for constructing AS (Gain)—or, as remarked above, AS (Gain ∩ GW_{i−1})—we can guess and validate membership for each state s *in* this set, using the NP result from Theorem 18, and we can guess and validate non-membership for each state s *not in* this set in NP, using the coNP result from Theorem 17. Similarly, we can guess and validate both membership and non-membership in ∪_k AS (k-Bailout ∩ GW_{i−1})—i.e., in ∪_k AS (k-Bailout) for the subgame with only the states in W_{i−1}—by using the NP and coNP result, respectively, from Theorem 14.

Once we can construct these sets, we can also intersect them and check if a fixed point has been reached. (One can, of course, stop when s /∈ Wi.)

We can now conduct the final check in NP using Theorem 18.

A coNP algorithm that constructs W can be designed analogously: once W_{i−1} is known, membership and non-membership of a state s in AS (Gain ∩ GW_{i−1}) can be guessed and validated in coNP by Theorem 17 and by Theorem 18, respectively; and membership or non-membership of a state in ∪_k AS (k-Bailout ∩ GW_{i−1}) can

be guessed and validated in coNP using the coNP and NP part, respectively, of Theorem 14.

Once W is constructed, we can conduct the final check in coNP using Theorem 17.

This result, together with the upper bound on the energy needed to win energy-parity objective, allows us to solve the "unknown initial energy problem" [7], which is to compute the minimal initial energy level required.

**Corollary 20.** *For any state* s*, checking if there is a* k *such that* s ∈ AS (EN(k) ∩ PAR) *is in* NP ∩ coNP*. Also, for a given* k∗*, checking if* k∗ *is the minimal energy level required to win almost surely is in* NP ∩ coNP *as well.*

*Proof.* Due to Theorem 14, if there is an energy level k for which EN(k) ∩ PAR holds almost-surely, then it also holds almost-surely for the bound K, whose size is polynomial in the size of the game. We can then simply calculate K and use the NP and coNP algorithms from Theorem 19 for AS (EN(K) ∩ PAR).

As for the second claim, note that checking whether maximizer cannot win almost surely EN(k) ∩ PAR is also in NP and coNP as a complement of a coNP and an NP set, respectively. Therefore, for an NP/coNP upper bound it suffices to simultaneously guess certificates for almost surely EN(k∗) ∩ PAR and not almost surely EN(k<sup>∗</sup> − 1) ∩ PAR and verify them in polynomial time.

Finally, let us mention that the slightly more restrictive *storage-parity* objectives can also be solved in NP ∩ coNP. These are almost identical to energy-parity except that, in addition, there must exist some bound l ∈ ℕ such that the energy level never drops by more than l during a run. This extra condition ensures that, if the storage-parity objective holds almost-surely, then there must exist a *finite-memory* winning strategy for maximizer.

**Theorem 21.** *One can check in* NP, coNP *and pseudo-polynomial time if, for a given SSG* H def= (V,E,λ)*,* k ∈ ℕ *and control state* s ∈ V*, maximizer can almost-surely satisfy* ST(k) ∩ PAR *from* s*.*

*Moreover, there is a bound* L ∈ ℕ*, polynomial in the number of states and the largest absolute transition reward, so that* ST(k) ∩ PAR = ST(k,L) ∩ PAR*.*

*Proof.* (sketch) This result follows by a simple adaptation of the proofs showing the same computational complexity of the Bailout objective (Section 4). See [41] for further details.

*Example 22.* In the game in Fig. 1, maximizer cannot ensure the storage-parity condition ST(k)∩PAR for any initial energy level k. This is because it would imply the existence of a finite-memory almost-surely winning strategy, which as we have already argued, cannot be true. More intuitively, to prevent an intermediate energy drop by l units, a winning maximizer strategy for storage-parity would need to stop moving left after observing the negative cycle in the leftmost state l successive times. However, when maximizer moves right, this gives minimizer the chance to visit the rightmost bad state (with dominating odd priority 1). The

chance of that happening is (1/3)<sup>l</sup> > 0. In particular, this probability is > 0 for any value of the intermediate energy drop l. Therefore, for any fixed l, maximizer would need to move right infinitely often to satisfy storage and lose (against an optimal minimizer strategy that moves to the rightmost state).

### **7 Conclusion and Outlook**

We showed that several almost-sure problems for combined energy-parity objectives in simple stochastic games are in NP ∩ coNP. No pseudo-polynomial algorithm is known (just like for stochastic mean-payoff parity games [20]). All these problems subsume (stochastic) parity games, by setting all rewards to 0. Thus the existence of a pseudo-polynomial algorithm would imply that (stochastic and non-stochastic) parity games are in P, which is a long-standing open problem.

It is known that maximizer already needs infinite memory to win a combined energy-parity objective almost-surely in MDPs [40]. Our results do not imply anything about the memory requirement for optimal minimizer strategies in SSGs for this objective. We conjecture that memoryless minimizer strategies suffice. If this conjecture holds (and is proven), this would greatly simplify the coNP upper bound that we established for this problem.

A natural question is whether results on mean-payoff/energy/parity games can be generalized to a setting with multi-dimensional payoffs. Non-stochastic multi-mean-payoff and multi-energy games have been studied in [48,36,1]. To the best of our knowledge, the techniques used there, e.g. upper bounds on the necessary energy levels as in [36], do not generalize to stochastic games (or MDPs).

Multiple mean-payoff objectives in MDPs have been studied in [10,24], but the corresponding multi-energy (resp. multi-energy-parity) objective has extra difficulties due to the 0-boundary condition on the energy. I.e., even on Markov chains, and without any parity condition, it subsumes problems about multidimensional random walks. Some partial results on Markov chains and MDPs have been obtained in [13,2,3], but the decidability of the almost-sure problem for stochastic multi-energy-parity games (and MDPs) remains open.

### **Acknowledgments**

The work of all the authors was supported in part by EPSRC grant EP/M027287/1. Sven Schewe and Dominik Wojtczak were also supported by EPSRC grant EP/P020909/1.

#### **References**



## **Nondeterministic Syntactic Complexity**

Robert S. R. Myers, Stefan Milius<sup>1</sup>, and Henning Urbat<sup>1</sup>

1 Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany my.robmyers@gmail.com, {stefan.milius,henning.urbat}@fau.de

**Abstract** We introduce a new measure on regular languages: their *nondeterministic syntactic complexity*. It is the least degree of any extension of the 'canonical boolean representation' of the syntactic monoid. Equivalently, it is the least number of states of any *subatomic* nondeterministic acceptor. It turns out that essentially all previous structural work on nondeterministic state-minimality computes this measure. Our approach rests on an algebraic interpretation of nondeterministic finite automata as deterministic finite automata endowed with semilattice structure. Crucially, the latter form a self-dual category.

### **1 Introduction**

Regular languages admit a plethora of equivalent representations: finite automata, finite monoids, regular expressions, formulas of monadic second-order logic, and numerous others. In many cases, the most succinct representation is given by a *nondeterministic finite automaton (nfa)*. Therefore, the investigation of stateminimal nfas is of both computational and mathematical interest. However, this turns out to be surprisingly intricate; in fact, the task of minimizing an nfa, or even of deciding whether a given nfa is minimal, is known to be PSPACE-complete [23]. One intuitive reason is that minimal nfas lack structure: a language may have many non-isomorphic minimal nondeterministic acceptors, and there are no clearly identified and easily verifiable mathematical properties distinguishing them from non-minimal ones. As a consequence, all known algorithms for nfa minimization (and related problems such as inclusion or universality testing) require some form of exhaustive search [9, 11,26]. This sharply contrasts the situation for minimal *deterministic finite automata (dfa)*: they can be characterized by a universal property making them unique up to isomorphism, which immediately leads to efficient minimization.

In the present paper, we work towards the goal of bringing more structure into the theory of nondeterministic state-minimality. To this end, we propose a novel algebraic perspective on nfas resting on *boolean representations* of monoids, i.e. morphisms *<sup>M</sup>* <sup>→</sup> **JSL**(*S, S*) from a monoid *<sup>M</sup>* into the endomorphism monoid

Supported by Deutsche Forschungsgemeinschaft (DFG) under projects MI 717/5-2 and MI 717/7-1, and as part of the Research and Training Group 2475 "Cybercrime and Forensic Computing" (393541319/GRK2475/1-2019)

Supported by Deutsche Forschungsgemeinschaft (DFG) under proj. SCHR 1118/8-2


of a finite join-semilattice *S*. Our focus lies on quotient monoids of the free monoid *Σ*<sup>∗</sup> recognizing a given regular language *L* ⊆ *Σ*∗. The largest such monoid is *Σ*<sup>∗</sup> itself, while the smallest one is the *syntactic monoid* syn(*L*). For both of them, *L* induces a *canonical boolean representation*

*Σ*^∗ → **JSL**(SLD(*L*), SLD(*L*)) and syn(*L*) → **JSL**(SLD(*L*), SLD(*L*))

on the semilattice SLD(*L*) of all finite unions of left derivatives of *L*. The first representation gives rise to an algebraic characterization of minimal nfas:

**Theorem.** The size of a state-minimal nfa for *L* equals the least degree of any extension of the canonical representation of *Σ*<sup>∗</sup> induced by *L*.

Here, the *degree* of a representation refers to the number of join-irreducibles of the underlying semilattice. In the light of this result, it is natural to ask for an analogous automata-theoretic perspective on the canonical representation of syn(*L*) and its extensions. For this purpose, we introduce the class of *subatomic* nfas, a generalization of *atomic* nfas earlier introduced by Brzozowski and Tamm [6]. In order to get a handle on them, we employ an algebraic framework that interprets nfas in terms of **JSL***-dfas*, i.e. deterministic finite automata in the category of semilattices. In this setting, the semilattice SLD(*L*) used in the canonical representations naturally arises as the *minimal* **JSL**-dfa for the language *L*. We shall demonstrate that much of the structure theory of (sub-)atomic nfas reduces to the observation that the category of **JSL**-dfas is *self-dual*. Our main result gives an algebraic characterization of minimal subatomic nfas:

**Theorem.** The size of a state-minimal subatomic nfa for *L* equals the least degree of any extension of the canonical representation of syn(*L*).

We call the measure suggested by the above theorem the *nondeterministic syntactic complexity* of the language *L*. It turns out to be extremely natural: as illustrated in Section 5, essentially all existing work on the structure of stateminimal nfas implicitly identifies classes of languages whose nondeterministic state complexity equals their nondeterministic syntactic complexity, and thus is actually concerned with computing minimal subatomic acceptors.

### **2 Preliminaries**

We start by introducing some notation and terminology used in the paper.

*Semilattices.* A *(join-)semilattice* is a poset (*S,* ≤*_S*) in which every finite subset *X* ⊆ *S* has a least upper bound, a.k.a. join, denoted by ⋁*X*. A *morphism* of semilattices is a map preserving all finite joins. Let **JSL** denote the category of join-semilattices and their morphisms. An element *j* of a semilattice *S* is *join-irreducible* if for all finite subsets *X* ⊆ *S* with *j* = ⋁*X* one has *j* ∈ *X*. Let

$$J(S) = \{ \ j \in S \; : \; j \text{ is join-irreducible} \}.$$

Let 2 = {0*,* 1} denote the two-element semilattice with 0 ≤ 1. Since 2 ∼= (P(1)*,* ⊆) is the free semilattice on a single generator, morphisms from 2 into a semilattice *S*

correspond uniquely to elements of *S*. Similarly, a morphism *f* : *S* → 2 corresponds uniquely to a *prime filter* *F* = *f*^{−1}[1] ⊆ *S*, i.e. an upwards closed subset such that ⋁*X* ∈ *F* implies *X* ∩ *F* ≠ ∅ for every finite subset *X* ⊆ *S*. If *S* is finite, prime filters are precisely the sets *F* = {*s* ∈ *S* : *s* ≰ *s*_0} for *s*_0 ∈ *S*. If *S* is a subsemilattice of a semilattice *T*, every prime filter *F* of *S* can be extended to the prime filter *T* \ (↓(*S* \ *F*)) of *T*, where ↓*X* = { *t* ∈ *T* : *t* ≤ *x* for some *x* ∈ *X* } denotes the down-closure of a subset *X* ⊆ *T*. Equivalently, every morphism *f* : *S* → 2 can be extended to a morphism *g* : *T* → 2. In category-theoretic terminology, this means that the semilattice 2 forms an injective object of **JSL**.

The category **JSL**_f of finite semilattices is *self-dual* [25]. The equivalence functor **JSL**_f → **JSL**_f^op sends a semilattice *S* to its *dual semilattice* *S*^op obtained by reversing the order, and a morphism *f* : *S* → *T* to the morphism *f*^∗ : *T*^op → *S*^op mapping *t* ∈ *T* to the ≤_S-largest element *s* ∈ *S* with *f*(*s*) ≤_T *t*. Note that *f* is *adjoint* to *f*^∗: for *s* ∈ *S* and *t* ∈ *T* we have *f*(*s*) ≤_T *t* iff *s* ≤_S *f*^∗(*t*).
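The adjoint f^∗ appearing in this self-duality can be computed directly for finite semilattices: since f preserves joins, the ≤_S-largest s with f(s) ≤_T t is the join of all such s. A small Python sketch (ours; the two powerset lattices and the inclusion map are hypothetical examples):

```python
from itertools import chain, combinations

def subsets(base):
    return [frozenset(x) for x in
            chain.from_iterable(combinations(base, r) for r in range(len(base) + 1))]

# Two finite join-semilattices: powersets of {1,2} and {1,2,3}, ordered by inclusion.
S, T = subsets({1, 2}), subsets({1, 2, 3})

def f(s):
    # A join-preserving map S -> T (here simply the inclusion of subsets).
    return frozenset(s)

def adjoint(f, S):
    """f*(t) = the largest s in S with f(s) <= t; because f preserves joins,
    this largest element is the union (join) of all such s."""
    def f_star(t):
        candidates = [s for s in S if f(s) <= t]
        return frozenset().union(*candidates)
    return f_star

f_star = adjoint(f, S)
t = frozenset({1, 3})
print(f_star(t))               # frozenset({1}): largest subset of {1,2} inside {1,3}
print(f(frozenset({1})) <= t)  # True, illustrating f(s) <= t iff s <= f*(t)
```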

*Languages.* A *language* is a subset *L* of *Σ*^∗, the set of finite words over an alphabet *Σ*. We let L̄ = *Σ*^∗ \ *L* denote the *complement* and *L*^r = {*w*^r : *w* ∈ *L*} the *reverse*, where *w*^r = *a_n ... a_1* for *w* = *a_1 ... a_n*. The *left derivatives*, *right derivatives* and *two-sided derivatives* of *L* are, respectively, given by *u*^{−1}*L* = {*w* ∈ *Σ*^∗ : *uw* ∈ *L*}, *Lv*^{−1} = {*w* ∈ *Σ*^∗ : *wv* ∈ *L*} and *u*^{−1}*Lv*^{−1} = {*w* ∈ *Σ*^∗ : *uwv* ∈ *L*} for *u, v* ∈ *Σ*^∗. More generally, for *U* ⊆ *Σ*^∗ the language *U*^{−1}*L* = ∪_{u∈U} *u*^{−1}*L* is called the *left quotient* of *L* w.r.t. *U*. We define the following sets of languages generated by *L*:


In other words, SLD(*L*) is the ∪-semilattice of all left quotients of *L*, or equivalently, the ∪-subsemilattice of P(*Σ*∗) generated by all left derivatives. Moreover, BLD(*L*) and BLRD(*L*) form the boolean subalgebras of P(*Σ*∗) generated by all left derivatives and all two-sided derivatives, respectively.
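For a regular language given by a dfa, left derivatives are easy to compute: u^{−1}L is accepted by the same dfa with the initial state moved along u. The following Python sketch (ours; the two-state example dfa is hypothetical) illustrates this and enumerates the reachable states, each of which represents a left derivative of L.

```python
# A dfa as (transition dict, initial state, set of final states).
# Hypothetical example: words over {a, b} containing at least one 'a'.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 1}
init, final = 0, {1}

def run(state, word):
    for a in word:
        state = delta[(state, a)]
    return state

def accepts(word, start=init):
    """Membership in the left derivative u^{-1}L, where start = run(init, u)."""
    return run(start, word) in final

# u^{-1}L is represented by the state reached on u:
print(accepts('b'))                        # False: 'b' is not in L
print(accepts('b', start=run(init, 'a')))  # True:  'ab' is in L, i.e. 'b' is in a^{-1}L

# Each reachable state represents a left derivative
# (for a minimal dfa the correspondence is one-to-one):
reachable, frontier = {init}, [init]
while frontier:
    q = frontier.pop()
    for a in 'ab':
        r = delta[(q, a)]
        if r not in reachable:
            reachable.add(r)
            frontier.append(r)
print(reachable)                           # {0, 1}
```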

### **3 Duality Theory of Semilattice Automata**

In this section, we set up the algebraic framework in which nondeterministic automata can be studied. Since it involves considering several different types of automata, it is convenient to view them all as instances of a general categorical concept. For the rest of this paper, let *Σ* denote a fixed finite input alphabet.

**Definition 3.1.** Let C be a category and let *X, Y* ∈ C be two fixed objects. An *automaton* in C is a quadruple (*S, δ, i, f*) consisting of an object *S* ∈ C of *states*, a family *δ* = (*δ_a* : *S* → *S*)_{a∈Σ} of morphisms representing *transitions*, and two morphisms *i*: *X* → *S* and *f* : *S* → *Y* representing *initial* and *final* states (see the left-hand diagram below). A *morphism* between automata (*S, δ, i, f*) and (*S′, δ′, i′, f′*) is given by a morphism *h*: *S* → *S′* in C preserving transitions, initial

states and final states, i.e. making the right-hand diagram below commute for all *a* ∈ *Σ*:

Let **Aut**(C ) denote the category of automata in C and their morphisms.

**Notation 3.2.** We put *δ_w* := *δ_{a_n}* ◦ ··· ◦ *δ_{a_1}* for *w* = *a_1 ... a_n* in *Σ*^∗.

**Example 3.3.** (1) An automaton *D* = (*S, δ, i, f*) in **Set**, the category of sets and functions, with *X* = 1 and *Y* = 2, is precisely a classical *deterministic automaton*. It is called a *dfa* if *S* is finite. We identify the map *i*: 1 → *S* with an initial state *s_0* = *i*(∗) ∈ *S*, and the map *f* : *S* → 2 with a set *F* = *f*^{−1}[1] ⊆ *S* of final states. The language *L*(*D, s*) *accepted* by a state *s* ∈ *S* is the set of all words *w* ∈ *Σ*^∗ such that *δ_w*(*s*) ∈ *F*. The language *L*(*D*) *accepted* by *D* is the language accepted by the state *s_0*.

(2) An automaton *N* = (*S, δ, i, f*) in **Rel**, the category of sets and relations, with *X* = *Y* = 1, is precisely a classical *nondeterministic automaton*. It is called an *nfa* if *S* is finite. We identify *i* ⊆ 1 × *S* with a set *I* ⊆ *S* of initial states and *f* ⊆ *S* × 1 with a set *F* ⊆ *S* of final states. Thus, in our view an nfa may have multiple initial states. The language *L*(*N,R*) *accepted* by a subset *R* ⊆ *S* consists of all *w* ∈ *Σ*<sup>∗</sup> such that (*r, s*) ∈ *δ<sup>w</sup>* for some *r* ∈ *R* and *s* ∈ *F*. The language *L*(*N*) *accepted* by *N* is the language accepted by the set *I*.

(3) An automaton *A* = (*S, δ, i, f*) in **JSL** with *X* = *Y* = 2, shortly a **JSL***automaton*, is given by a semilattice *S* of states, a family *δ* = (*δ<sup>a</sup>* : *S* → *S*)*<sup>a</sup>*∈*<sup>Σ</sup>* of semilattice morphisms specifying transitions, an initial state *s*<sup>0</sup> ∈ *S* (corresponding to *i*: 2 → *S*), and a prime filter *F* ⊆ *S* of final states (corresponding to *<sup>f</sup>* : *<sup>S</sup>* <sup>→</sup> <sup>2</sup>). It is called a **JSL***-dfa* if *<sup>S</sup>* is finite. The language *accepted* by a state *s* ∈ *S* or by the automaton *A*, resp., is defined as for deterministic automata.

**Remark 3.4 (JSL-dfas vs. nfas).** Dfas, nfas and **JSL**-dfas are expressively equivalent; they all accept precisely the regular languages. The interest of **JSL**dfas is that they constitute an algebraic representation of nfas:

(1) Every **JSL**-dfa *A* = (*S, δ, s*0*, F*) induces an equivalent nfa *J*(*A*) on the set *J*(*S*) of join-irreducibles of *S*. Given *s, t* ∈ *J*(*S*) and *a* ∈ *Σ*, there is a transition *s a* −→ *t* in *J*(*A*) iff *t* ≤ *δa*(*s*); the initial states are those *s* ∈ *J*(*S*) with *s* ≤ *s*0, and the final states form the set *J*(*S*) ∩ *F*.

(2) Conversely, for every nfa *N* = (*Q, δ, I, F*), the *subset construction* yields an equivalent **JSL**-dfa <sup>P</sup>(*N*) with states <sup>P</sup>(*Q*) (the <sup>∪</sup>-semilattice of subsets of *<sup>Q</sup>*), transitions P*δ<sup>a</sup>* : P(*Q*) → P(*Q*), *X* → *δa*[*X*], initial state *I* ∈ P(*Q*), and final states those subsets of *Q* containing some state from *F*. Note that *J*(P(*Q*)) ∼= *Q*.

It follows that the task of finding a state-minimal nfa for a given language is equivalent to finding a **JSL**-dfa with a minimum number of join-irreducibles [4]. This idea has recently been extended to a general coalgebraic framework [32,39].
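
To make Remark 3.4(2) concrete, the following Python sketch performs the subset construction for an nfa given as a transition dictionary and checks that the join-irreducibles of the resulting powerset semilattice are exactly the singletons. The encoding, the helper names and the small example nfa are ours, not taken from the paper.

```python
from itertools import combinations

def powerset(qs):
    """All subsets of the state set Q, i.e. the carrier of the semilattice P(Q)."""
    qs = sorted(qs)
    return [frozenset(c) for r in range(len(qs) + 1) for c in combinations(qs, r)]

def subset_construction(states, delta, initial, final):
    """Subset construction P(N) of an nfa N = (Q, delta, I, F), as in Remark 3.4(2).

    delta maps (state, letter) to a set of successors.  The result is a JSL-dfa:
    the states are all subsets of Q (joins are unions), the lifted transition on a
    letter a sends X to delta_a[X], the initial state is I, and a subset is final
    iff it contains some state from F.
    """
    letters = {a for (_, a) in delta}
    P = powerset(states)
    trans = {(X, a): frozenset(q2 for q in X for q2 in delta.get((q, a), set()))
             for X in P for a in letters}
    finals = {X for X in P if X & set(final)}
    return P, trans, frozenset(initial), finals

# A two-state nfa over {a} accepting the words of length at least one.
Q, delta, I, F = {0, 1}, {(0, 'a'): {0, 1}, (1, 'a'): set()}, {0}, {1}
P, trans, init, finals = subset_construction(Q, delta, I, F)

# The join-irreducibles of P(Q) are the singletons, so J(P(N)) is Q again.
assert len([X for X in P if len(X) == 1]) == len(Q)
```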

Recall that the *minimal dfa* [7] for a regular language *L*, denoted by dfa(*L*), has states LD(*L*) (the set of left derivatives of *<sup>L</sup>*), transitions *<sup>K</sup> <sup>a</sup>* −→ *<sup>a</sup>*−<sup>1</sup>*<sup>K</sup>* for *<sup>K</sup>* <sup>∈</sup> LD(*L*) and *<sup>a</sup>* <sup>∈</sup> *<sup>Σ</sup>*, initial state *<sup>L</sup>* <sup>=</sup> *<sup>ε</sup>*−<sup>1</sup>*L*, and final states those *<sup>K</sup>* <sup>∈</sup> LD(*L*) containing *ε*. Up to isomorphism, it can be characterized as the unique dfa accepting *L* that is *reachable* (i.e. every state is reachable from the initial state via transitions) and *simple* (i.e. any two distinct states accept distinct languages). We now develop the analogous concepts for **JSL**-automata; they are instances of the categorical theory of minimality due to Arbib and Manes [3] and Goguen [15]. Let us first observe that every language has two canonical infinite **JSL**-acceptors:

### **Definition 3.5.** Let *L* ⊆ *Σ*<sup>∗</sup> be a language.

(1) The *initial* **JSL***-automaton* Init(*L*) for *L* has states $\mathcal{P}_{\mathrm{f}}(\Sigma^*)$ (the ∪-semilattice of finite subsets of *Σ*∗), initial state {*ε*}, final states all $X \in \mathcal{P}_{\mathrm{f}}(\Sigma^*)$ with $X \cap L \neq \emptyset$, and transitions $X \mapsto Xa = \{xa : x \in X\}$ for $X \in \mathcal{P}_{\mathrm{f}}(\Sigma^*)$ and $a \in \Sigma$.

(2) The *final* **JSL***-automaton* Fin(*L*) for *<sup>L</sup>* has states <sup>P</sup>(*Σ*∗) (the <sup>∪</sup>-semilattice of all languages), initial state *L*, final states all languages *K* containing *ε*, and transitions *<sup>K</sup>* → *<sup>a</sup>*−<sup>1</sup>*<sup>K</sup>* for *<sup>K</sup>* ∈ P(*Σ*∗) and *<sup>a</sup>* <sup>∈</sup> *<sup>Σ</sup>*.

As suggested by the terminology, these automata form the initial and the final object in the category of **JSL**-automata accepting *L*:

**Lemma 3.6 [3, 15].** *For every* **JSL***-automaton A* = (*S, δ, s*0*, F*) *accepting the language <sup>L</sup>* <sup>⊆</sup> *<sup>Σ</sup>*∗*, there exist unique* **JSL***-automata morphisms*

*e<sup>A</sup>* : Init(*L*) → *A and m<sup>A</sup>* : *A* → Fin(*L*)*.*

*The map $e_A$ sends $\{w_1, \ldots, w_n\} \in \mathcal{P}_{\mathrm{f}}(\Sigma^*)$ to the state $\bigvee_{i=1}^{n} \delta_{w_i}(s_0)$, and the map $m_A$ sends a state $s \in S$ to $L(A, s)$, the language accepted by $s$.*

**Definition 3.7.** A **JSL**-automaton *A* = (*S, δ, s*0*, F*) is called

(1) *reachable* if the unique morphism $e_A \colon \mathsf{Init}(L) \to A$ is surjective, i.e. every state is of the form $\bigvee_{i=1}^{n} \delta_{w_i}(s_0)$ for some $w_1, \ldots, w_n \in \Sigma^*$;

(2) *simple* if the unique morphism $m_A \colon A \to \mathsf{Fin}(L)$ is injective, i.e. any two distinct states accept distinct languages;

(3) *minimal* if it is both reachable and simple.

**Remark 3.8.** (1) The category **Aut**(**JSL**) has a factorization system given by surjective and injective morphisms. Thus, for every **JSL**-automata morphism $h \colon (S, \delta, i, f) \to (S', \delta', i', f')$ with image factorization $h = (S \stackrel{e}{\twoheadrightarrow} S'' \stackrel{m}{\rightarrowtail} S')$ in **JSL**, there exists a unique **JSL**-automaton structure $(S'', \delta'', i'', f'')$ on $S''$ making both *e* and *m* automata morphisms. We call *e* the *coimage* and *m* the *image* of *h*. *Subautomata* and *quotient automata* of **JSL**-automata are represented by injective and surjective morphisms, respectively.

(2) Every **JSL**-automaton *A* has a unique reachable subautomaton $\mathsf{reach}(A) \rightarrowtail A$, the *reachable part* of *A*. It is the smallest subautomaton of *A* and arises as the image of the unique morphism $e_A \colon \mathsf{Init}(L) \to A$. Thus,

*A* is reachable iff *A* ∼= reach(*A*) iff *A* has no proper subautomaton*.*

Let us emphasize that a state in reach(*A*) is not necessarily reachable when *A* is viewed as an ordinary dfa. For distinction, we thus call a state **JSL***-reachable* if it lies in reach(*A*), and *dfa-reachable* if it is reachable in the usual sense.

(3) Dually, every **JSL**-automaton *A* has a unique simple quotient automaton $A \twoheadrightarrow \mathsf{simple}(A)$, the *simplification* of *A*. It is the smallest quotient automaton of *A* and arises as the coimage of the unique morphism $m_A \colon A \to \mathsf{Fin}(L)$. Thus,

*A* is simple iff *A* ∼= simple(*A*) iff *A* has no proper quotient automaton*.*

(4) Every language $L \subseteq \Sigma^*$ has a minimal **JSL**-automaton, unique up to isomorphism. It can be constructed as the image of the unique automata morphism $h_L \colon \mathsf{Init}(L) \to \mathsf{Fin}(L)$. Since $h_L$ sends $\{w_1, \ldots, w_n\} \in \mathcal{P}_{\mathrm{f}}(\Sigma^*)$ to the language $\bigcup_{i=1}^{n} w_i^{-1}L$, the minimal automaton of *L* is the subautomaton SLD(*L*) of Fin(*L*) carried by the semilattice of finite unions of left derivatives of *L*.

**Example <sup>3</sup>.9.** The minimal **JSL**-dfa accepting *<sup>L</sup>* <sup>=</sup> {*a, aa*} is shown below, with the dashed lines representing the partial order.
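
The semilattice SLD(*L*) of this example is easy to enumerate mechanically. Below is a small Python sketch, using our own encoding of a finite language as a set of strings: it computes the left derivatives of *L* = {*a, aa*} and closes them under union, recovering the four derivatives and the five-element carrier of the minimal **JSL**-dfa.

```python
def left_derivative(L, u):
    """u^{-1}L = { w : uw in L }, for a finite language L given as a set of words."""
    return frozenset(w[len(u):] for w in L if w.startswith(u))

def left_derivatives(L):
    """All left derivatives of a finite language: the derivatives along prefixes of
    words in L, together with the empty set (the derivative along any other word)."""
    prefixes = {w[:i] for w in L for i in range(len(w) + 1)}
    derivs = {left_derivative(L, u) for u in prefixes}
    derivs.add(frozenset())   # u^{-1}L is empty when u is not a prefix of a word in L
    return derivs

def union_closure(family):
    """Close a family of languages under union (including the empty union), i.e.
    generate the U-semilattice SLD(L) from the left derivatives."""
    closure = {frozenset()} | set(family)
    changed = True
    while changed:
        changed = False
        for X in list(closure):
            for Y in list(closure):
                if X | Y not in closure:
                    closure.add(X | Y)
                    changed = True
    return closure

L = frozenset({'a', 'aa'})
LD = left_derivatives(L)        # {a,aa}, {eps,a}, {eps} and the empty set
SLD = union_closure(LD)
assert len(LD) == 4 and len(SLD) == 5
```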

**Remark 3.10.** The self-duality of **JSL**<sup>f</sup> lifts to a self-duality of the category of **JSL**-dfas. The equivalence functor **Aut**(**JSL**f) −→ **Aut**(**JSL**f)op maps a **JSL**-dfa *A* = (*S,* (*δ<sup>a</sup>* : *S* → *S*)*<sup>a</sup>*∈*<sup>Σ</sup>, i*: 2 → *S, f* : *S* → 2) to its *dual automaton*

$$A^{\textsf{op}} = (S^{\textsf{op}}, (\delta\_a^\* \colon S^{\textsf{op}} \to S^{\textsf{op}})\_{a \in \Sigma}, \, f^\* \colon 2 \to S^{\textsf{op}}, \, i^\* \colon S^{\textsf{op}} \to 2),$$

using that <sup>2</sup>op <sup>∼</sup><sup>=</sup> <sup>2</sup>. Thus, the initial state of *<sup>A</sup>*op is the <sup>≤</sup>*<sup>S</sup>*-largest non-final state of *A*, and its final states are those *s* ∈ *S* with *s*<sup>0</sup> ≤*<sup>S</sup> s*. Given *s, t* ∈ *S* and *a* ∈ *Σ*, there is a transition *<sup>s</sup> <sup>a</sup>* −→ *<sup>t</sup>* in *<sup>A</sup>*op iff *<sup>t</sup>* is the <sup>≤</sup>*<sup>S</sup>*-largest state with *<sup>δ</sup>a*(*t*) <sup>≤</sup>*<sup>S</sup> <sup>s</sup>*.

The dualization of **JSL**-dfas can be seen as an algebraic generalization of the reversal operation on nfas. Recall that the *reverse* of an nfa *N* is the nfa *N*<sup>r</sup> obtained by flipping all transitions and swapping initial and final states. If *N* accepts the language *L*, then *N*<sup>r</sup> accepts the reverse language *L*<sup>r</sup> .

**Lemma 3.11.** *For each nfa N* = (*Q, δ, I, F*)*, we have the* **JSL***-dfa isomorphism*

$$[\mathcal{P}(N)]^{\mathsf{op}} \xrightarrow{\;\cong\;} \mathcal{P}(N^{\mathsf{r}}), \qquad X \mapsto \overline{X} = Q \setminus X.$$
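
The reversal operation underlying this lemma is straightforward to implement. The following sketch (our own encoding, with an ad-hoc three-state example) flips the transitions of an nfa, swaps initial and final states, and checks on a few words that the reverse automaton accepts exactly the reversed language.

```python
def reverse_nfa(states, delta, initial, final):
    """The reverse nfa N^r: flip every transition and swap initial and final states."""
    rdelta = {}
    for (q, a), succs in delta.items():
        for q2 in succs:
            rdelta.setdefault((q2, a), set()).add(q)
    return states, rdelta, set(final), set(initial)

def accepts(states, delta, initial, final, word):
    """Standard nfa acceptance: track the set of states reachable from I along the word."""
    current = set(initial)
    for a in word:
        current = {q2 for q in current for q2 in delta.get((q, a), set())}
    return bool(current & set(final))

# A three-state nfa accepting exactly the word 'ab'; its reverse accepts exactly 'ba'.
Q, delta, I, F = {0, 1, 2}, {(0, 'a'): {1}, (1, 'b'): {2}}, {0}, {2}
rQ, rdelta, rI, rF = reverse_nfa(Q, delta, I, F)
for w in ['ab', 'ba', 'abb', '']:
    assert accepts(Q, delta, I, F, w) == accepts(rQ, rdelta, rI, rF, w[::-1])
```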

The following lemma summarizes some important properties of *A*op:

**Lemma 3.12.** *Let A* = (*S, δ, i, f*) *be a* **JSL***-dfa.*

(1) *For every <sup>s</sup>* <sup>∈</sup> *<sup>S</sup>, we have <sup>L</sup>*(*A*op*, s*) = { *<sup>w</sup>* <sup>∈</sup> *<sup>Σ</sup>*<sup>∗</sup> : *<sup>δ</sup><sup>w</sup>*r(*s*0) ≤*<sup>S</sup> <sup>s</sup>* }*.*


Our next goal is to give, for every regular language *L*, dual characterizations of SLD(*L*), BLD(*L*) and BLRD(*L*), the **JSL**-subautomata of Fin(*L*) carried by all finite unions of left derivatives, boolean combinations of left derivatives and boolean combinations of two-sided derivatives, respectively. These results form the core of our duality-based approach to (sub-)atomic nfas in the next section. The minimal **JSL**-dfa SLD(*L*) admits the following dual description:

**Proposition 3.13.** *For every regular language L, the minimal* **JSL***-dfas for L and L*<sup>r</sup> *are dual. More precisely, we have the* **JSL***-dfa isomorphism*

$$\mathrm{dr}_L \colon [\mathsf{SLD}(L^{\mathsf{r}})]^{\mathsf{op}} \xrightarrow{\;\cong\;} \mathsf{SLD}(L), \qquad K \mapsto (K^{\mathsf{r}})^{-1}L.$$

**Remark 3.14.** (1) The isomorphism dr*<sup>L</sup>* induces a bijection between the *left* and *right factors* of *L*, i.e. the inclusion-maximal left/right solutions of *X* ·*Y* ⊆ *L*. Conway [10] observed that the left and right factors are respectively {*K*<sup>r</sup> : *K* ∈ SLD(*L*<sup>r</sup> )} and {*K* : *K* ∈ SLD(*L*)} and that they biject. Backhouse [5] observed that they are dually isomorphic posets. Proposition 3.13 provides an explicit automata-theoretic lattice isomorphism arising canonically via duality.

(2) The isomorphism dr*<sup>L</sup>* is tightly connected to the *dependency relation* [18,20] of a regular language *L*, i.e. the binary relation given by

$$\mathcal{D}\mathcal{R}\_L \subseteq \mathsf{LD}(L) \times \mathsf{LD}(L^r), \qquad \mathcal{D}\mathcal{R}\_L(u^{-1}L, v^{-1}L^r) : \Longleftrightarrow \ uv^r \in L.$$

Its restriction DR*<sup>j</sup> <sup>L</sup>* := DR*<sup>L</sup>* <sup>∩</sup> *<sup>J</sup>*(SLD(*L*)) <sup>×</sup> *<sup>J</sup>*(SLD(*L<sup>r</sup>*)) to the <sup>∪</sup>-irreducible left derivatives of *L* and *L*<sup>r</sup> is called the *reduced dependency relation*. The following theorem shows that the semilattice of left quotients and the dependency relation are essentially the same concepts. In part (3), we use that the isomorphism dr*<sup>L</sup>* restricts to a bijection between the <sup>∪</sup>-irreducible derivatives of *<sup>L</sup>*<sup>r</sup> and the meet-irreducible elements of the lattice SLD(*L*).

#### **Theorem 3.15 (Dependency theorem).**

(1) *We have the* **JSL***-isomorphism*

$$\mathsf{SLD}(L) \xrightarrow{\;\cong\;} (\{\mathcal{DR}_L[X] : X \subseteq \mathsf{LD}(L)\}, \cup, \emptyset), \qquad K \mapsto \{v^{-1}L^{\mathsf{r}} : v \in K^{\mathsf{r}}\}.$$

*Note that its codomain forms a subsemilattice of* <sup>P</sup>(LD(*L*<sup>r</sup> ))*.*

(2) *For all u, v* <sup>∈</sup> *<sup>Σ</sup>*<sup>∗</sup> *we have* DR*<sup>L</sup>*(*u*−<sup>1</sup>*L, v*−<sup>1</sup>*L<sup>r</sup>*) ⇐⇒ *<sup>u</sup>*−<sup>1</sup>*<sup>L</sup>* dr*L*(*v*−<sup>1</sup>*L<sup>r</sup>*)*.*

(3) *The following diagram in* **Rel** *commutes:*

$$\begin{array}{c} J(\mathsf{SLD}(L^{\mathsf{r}})) \xrightarrow{\operatorname{dr}\_{L}} M(\mathsf{SLD}(L)) \\ \mathcal{D}\mathcal{R}\_{L}^{j} \uparrow \\ J(\mathsf{SLD}(L)) \xrightarrow{\prod} J(\mathsf{SLD}(L)) \end{array}$$

Let us now turn to a dual characterization of the **JSL**-dfa BLD(*L*):

**Proposition 3.16.** *For every regular language L, the* **JSL***-dfa* BLD(*L*) *is dual to the subset construction of the minimal dfa for L*<sup>r</sup> *:*

$$[\mathsf{BLD}(L)]^{\mathsf{op}} \cong \mathcal{P}(\mathsf{dfa}(L^r)).$$

*The isomorphism maps $\{w_1^{-1}L^{\mathsf{r}}, \ldots, w_n^{-1}L^{\mathsf{r}}\} \in \mathcal{P}(\mathsf{dfa}(L^{\mathsf{r}}))$ to $\bigcup_{i=1}^{n} \mathrm{At}(w_i^{\mathsf{r}})$, where $\mathrm{At}(x)$ is the unique atom (= join-irreducible) of* BLD(*L*) *containing x.*

To state the dual characterization of BLRD(*L*), we recall two standard concepts from algebraic language theory [33]. The *transition monoid* of a deterministic automaton *<sup>D</sup>* = (*S, δ, i, f*) is the image tm(*D*) <sup>⊆</sup> **Set**(*S, S*) of the morphism

$$
\Sigma^\* \to \mathbf{Set}(S, S), \quad w \mapsto \delta\_w.
$$

Thus, tm(*D*) is carried by the set of extended transition maps $\delta_w$ ($w \in \Sigma^*$) with multiplication given by $\delta_v \bullet \delta_w = \delta_{vw}$ and unit $\mathit{id}_S = \delta_\varepsilon \colon S \to S$. We may view tm(*D*) as a deterministic automaton with initial state $\mathit{id}_S$, final states all $\delta_w$ such that *w* is accepted by *D*, and transitions $\delta_w \xrightarrow{a} \delta_{wa}$ for $w \in \Sigma^*$ and $a \in \Sigma$. This automaton accepts the same language as *D*. The *syntactic monoid* syn(*L*) of a regular language *L* ⊆ *Σ*<sup>∗</sup> is the transition monoid of its minimal dfa:

$$\mathfrak{syn}(L) = \mathfrak{tm}(\mathsf{dfa}(L)).$$

Equivalently, syn(*L*) is the quotient monoid of the free monoid *Σ*<sup>∗</sup> modulo the *syntactic congruence* of *L*, i.e. the monoid congruence on *Σ*<sup>∗</sup> given by

$$v \equiv_L w \quad \text{iff} \quad \forall x, y \in \Sigma^* : xvy \in L \iff xwy \in L.$$

The associated surjective monoid morphism $\mu_L \colon \Sigma^* \twoheadrightarrow \mathsf{syn}(L)$, mapping $w \in \Sigma^*$ to its congruence class $[w]_L \in \mathsf{syn}(L)$, is called the *syntactic morphism*.
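
Both monoids are finite and can be computed by closing the letter maps under composition, starting from the identity. The following Python sketch does this for a dfa given as a transition dictionary; the example language (*a* + *b*)∗*a* and all function names are ours.

```python
def transition_monoid(states, delta, alphabet):
    """Transition monoid of a dfa D: the set of extended transition maps delta_w.

    delta maps (state, letter) to a single state.  Each delta_w is represented as a
    tuple indexed by a fixed ordering of the states; starting from the identity
    (delta_epsilon), the generators {delta_a} are closed under composition."""
    order = sorted(states)
    idx = {q: i for i, q in enumerate(order)}
    identity = tuple(range(len(order)))
    gens = [tuple(idx[delta[(q, a)]] for q in order) for a in alphabet]
    monoid, frontier = {identity}, [identity]
    while frontier:
        f = frontier.pop()
        for g in gens:
            h = tuple(g[f[i]] for i in range(len(order)))   # delta_{wa} from delta_w
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid

# syn((a+b)*a) is the transition monoid of the minimal two-state dfa of (a+b)*a:
# it has the three elements [eps], [a] and [b].
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 1, (1, 'b'): 0}
assert len(transition_monoid({0, 1}, delta, 'ab')) == 3
```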

**Proposition 3.17.** *For every regular language L, the* **JSL***-dfa* BLRD(*L*) *is dual to the subset construction of* syn(*L*<sup>r</sup> )*, viewed as a dfa:*

$$[\mathsf{BLRD}(L)]^{\mathsf{op}} \cong \mathcal{P}(\mathsf{syn}(L^{\mathsf{r}})).$$

*The isomorphism maps $\{[w_1]_{L^{\mathsf{r}}}, \ldots, [w_n]_{L^{\mathsf{r}}}\} \in \mathcal{P}(\mathsf{syn}(L^{\mathsf{r}}))$ to $\bigcup_{i=1}^{n} \mathrm{At}(w_i^{\mathsf{r}})$, with $\mathrm{At}(x)$ denoting the unique atom of* BLRD(*L*) *containing x.*

Our final duality result in this section concerns the *transition semiring* [35], a generalization of the transition monoid to **JSL**-automata. Note that the monoid **JSL**(*S, S*) of endomorphisms of a semilattice *S* forms an idempotent semiring with join defined pointwise: for any *f,g* : *S* → *S*, the morphism *f* ∨ *g* : *S* → *S* is given by *<sup>s</sup>* → *<sup>f</sup>*(*s*) <sup>∨</sup> *<sup>g</sup>*(*s*). The transition semiring of a **JSL**-automaton *<sup>A</sup>* = (*S, δ, i, f*) is the image ts(*A*) <sup>⊆</sup> **JSL**(*S, S*) of the semiring morphism

$$\mathcal{P}\_{\mathbf{f}}(\boldsymbol{\Sigma}^\*) \to \mathbf{JSL}(S, S), \quad \{w\_1, \dots, w\_n\} \mapsto \bigvee\_{i=1}^n \delta\_{w\_i}.$$

Here $\mathcal{P}_{\mathrm{f}}(\Sigma^*)$ is the free idempotent semiring on *Σ*, with composition given by concatenation of languages and join given by union. Thus, ts(*A*) is the semiring carried by all morphisms $\bigvee_{i=1}^{n} \delta_{w_i}$ for $w_1, \ldots, w_n \in \Sigma^*$, with join given as above and multiplication $\bigvee_j \delta_{v_j} \bullet \bigvee_i \delta_{w_i} = \bigvee_{i,j} \delta_{v_j w_i}$. We view ts(*A*) as a **JSL**-automaton with initial state $\mathit{id}_S = \delta_\varepsilon$, final states all $\bigvee_i \delta_{w_i}$ such that some $w_i$ is accepted by *A*, and transitions $\bigvee_{i=1}^{n} \delta_{w_i} \xrightarrow{a} \bigvee_{i=1}^{n} \delta_{w_i a}$ for $w_1, \ldots, w_n \in \Sigma^*$ and $a \in \Sigma$. This **JSL**-automaton is reachable and accepts the same language as *A*. It has the following dual characterization:

**Notation 3.18.** Given a simple **JSL**-automaton *A* = (*S, δ, i, f*), the subautomaton of Fin(*L*) obtained by closing *S* (viewed as a set of languages) under right derivatives is called the *right-derivative closure* of *A* and denoted rdc(*A*).

**Proposition 3.19.** *Let A be a reachable* **JSL***-dfa. Then the transition semiring of A, viewed as a* **JSL***-dfa, is dual to the right-derivative closure of A*op*:*

$$[\mathsf{ts}(A)]^{\mathsf{op}} \cong \mathsf{rdc}(A^{\mathsf{op}}).$$

Note that both [ts(*A*)]op and rdc(*A*op) are simple, hence subautomata of Fin(*L*). Thus, the isomorphism just expresses that their states accept the same languages.

### **4 Boolean Representations and Subatomic NFAs**

Based upon the duality results of the previous section, we will now introduce our algebraic approach to nondeterministic state minimality. It rests on the concept of a representation of a monoid on a finite semilattice.

**Definition 4.1 (Boolean representation).** Let *M* be a monoid.

(1) A *boolean representation* of *M* is given by a finite semilattice *S* together with a monoid morphism *<sup>ρ</sup>*: *<sup>M</sup>* <sup>→</sup> **JSL**(*S, S*). The *degree* of *<sup>ρ</sup>* is

$$\deg(\rho) := |J(S)|.$$

(2) Given boolean representations *<sup>ρ</sup><sup>i</sup>* : *<sup>M</sup>* <sup>→</sup> **JSL**(*Si, Si*), *<sup>i</sup>* = 1*,* <sup>2</sup>, an *equivariant map <sup>f</sup>* : *<sup>ρ</sup>*<sup>1</sup> <sup>→</sup> *<sup>ρ</sup>*<sup>2</sup> is a **JSL**-morphism *<sup>f</sup>* : *<sup>S</sup>*<sup>1</sup> <sup>→</sup> *<sup>S</sup>*<sup>2</sup> such that

$$f(\rho\_1(m)(s)) = \rho\_2(m)(f(s)) \text{ for all } m \in M \text{ and } s \in S\_1.$$

If *f* is injective, we say that the representation *ρ*<sup>2</sup> *extends ρ*1.

**Remark 4.2.** (1) The above representations are called *boolean* because semilattices are precisely semimodules over the boolean semiring 2 = {0*,* 1} with 1+1 = 1. For more on representations over general commutative semirings, see [21].

(2) The category of boolean representations of *M* coincides with the functor category **JSL***<sup>M</sup>* <sup>f</sup> , viewing *M* as a one object category.

**Definition 4.3 (Canonical representation).** For every regular language *L*, the *canonical boolean representation* of the syntactic monoid syn(*L*) is given by

$$\kappa_L \colon \mathsf{syn}(L) \to \mathbf{JSL}(\mathsf{SLD}(L), \mathsf{SLD}(L)), \quad [w]_L \mapsto \lambda K.\, w^{-1}K.$$

It induces the *canonical boolean representation* of the free monoid *Σ*<sup>∗</sup> given by

$$\kappa\_L \circ \mu\_L \colon \Sigma^\* \to \mathbf{JSL}(\mathbf{SLD}(L), \mathbf{SLD}(L)), \quad w \mapsto \lambda K. w^{-1}K,$$

where $\mu_L \colon \Sigma^* \twoheadrightarrow \mathsf{syn}(L)$ is the syntactic morphism.

The representation *κ<sup>L</sup>* ◦ *μ<sup>L</sup>* amounts to constructing the transition semiring of the minimal **JSL**-automaton SLD(*L*), i.e. the *syntactic semiring* [35] of *L*.

**Example 4.4.** We describe the canonical boolean representation $\kappa_{L_n}$ for the language $L_n := (0 + 1)^*1(0 + 1)^n$, $n \in \mathbb{N}$. Let $S := 2^{n+1}_\bot$ be the semilattice of binary words of length *n* + 1, ordered pointwise, with an additional bottom element ⊥. Then $\mathsf{SLD}(L_n)$ is isomorphic to *S*, as witnessed by the isomorphism

$$f \colon S \xrightarrow{\cong} \mathsf{SLD}(L\_n), \quad f(\bot) = \emptyset, \quad f(w) = w^{-1} L\_n.$$

Thus, $\kappa_{L_n}$ is isomorphic to the representation $\rho \colon \mathsf{syn}(L_n) \to \mathbf{JSL}(S, S)$ where: (1) $\rho([0]_{L_n}) \colon S \to S$ performs a left-shift (distinct from a left-rotate); (2) $\rho([1]_{L_n}) \colon S \to S$ performs a left-shift and sets the last bit to 1.

Finally, $\deg(\kappa_{L_n}) = \deg(\rho) = 1 + |J(2^{n+1})| = n + 2$ is the number of states of the usual minimal nfa for $L_n$.
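
The left-shift description can be made concrete as follows. In the sketch below (our own encoding), an element of *S* other than ⊥ is a tuple of *n* + 1 bits, bit *i* recording whether extending the word read so far by exactly *i* further letters lands in $L_n$; the bottom element ⊥ is encoded as `None`.

```python
def rho(letter, s):
    """The action of a letter in the representation of Example 4.4, on the semilattice
    S = {0,1}^{n+1} extended with a bottom element (encoded as None): shift the
    bit-vector one position to the left and append 1 exactly when the letter is '1'."""
    if s is None:        # bottom = the empty language; it is fixed by every w^{-1}(-)
        return None
    return s[1:] + ((1,) if letter == '1' else (0,))

def in_L(word, n):
    """Run rho from the derivative at the empty word (the all-zero vector); bit 0 of
    the resulting vector records whether the word read so far belongs to L_n."""
    s = (0,) * (n + 1)
    for letter in word:
        s = rho(letter, s)
    return s[0] == 1

# L_2 = (0+1)* 1 (0+1)^2: membership depends only on the third letter from the end.
assert in_L('0100', 2) and not in_L('0010', 2)
```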

**Example 4.5.** We describe the canonical boolean representation $\kappa_L$ for the language $L = a_1(a_2 + a_3) + a_2(a_1 + a_3) + a_3(a_1 + a_2)$ over $\Sigma = \{a_1, a_2, a_3\}$. Consider the ∪-semilattice $M_3 = \{\emptyset, \{a_1, a_2\}, \{a_1, a_3\}, \{a_2, a_3\}, \Sigma\}$. Then SLD(*L*) is isomorphic to the product semilattice $2 \times M_3 \times 2$ via the map

$$f \colon \mathsf{SLD}(L) \xrightarrow{\cong} 2 \times M\_3 \times 2, \quad f(X) = (X \cap \Sigma^2, X \cap \Sigma, X \cap \{\varepsilon\}).$$

Note that the first and the third components each take only two values (∅ and one other set), so each may be identified with the two-element semilattice 2. For *i* = 1*,* 2*,* 3 we define the following semilattice morphisms:


Then *<sup>κ</sup><sup>L</sup>* is isomorphic to *<sup>ρ</sup>*: syn(*L*) <sup>→</sup> **JSL**(2 <sup>×</sup> *<sup>M</sup>*<sup>3</sup> <sup>×</sup> <sup>2</sup>*,* <sup>2</sup> <sup>×</sup> *<sup>M</sup>*<sup>3</sup> <sup>×</sup> 2) where

$$\rho([a\_i]\_L) = (2 \times M\_3 \times 2 \xrightarrow{\alpha\_i \times \beta\_i \times \gamma} M\_3 \times 2 \times 2 \xrightarrow{\delta} 2 \times M\_3 \times 2).$$

Thus, $\deg(\kappa_L) = \deg(\rho) = 1 + 3 + 1 = 5$. An analogous description of $\kappa_L$ exists for any language *L* in which all words have the same length.

The next theorem links minimal nfas and representations.

**Definition 4.6.** The *nondeterministic state complexity* ns(*L*) of a regular language *L* is the least number of states of any nfa accepting *L*.

**Theorem 4.7.** *For every regular language L, the nondeterministic state complexity* ns(*L*) *is the least degree of any boolean representation extending the canonical representation <sup>κ</sup><sup>L</sup>* ◦ *<sup>μ</sup><sup>L</sup>* : *<sup>Σ</sup>*<sup>∗</sup> <sup>→</sup> **JSL**(SLD(*L*)*,* SLD(*L*))*.*

#### *Proof (Sketch).*

(1) Given a *k*-state nfa *N* = (*Q, δ, I, F*) accepting *L*, consider the subsemilattice langs(*N*) = simple(P(*N*)) of P(*Σ*∗) on all languages accepted by subsets of *Q*. The embedding $\mathsf{SLD}(L) \rightarrowtail \mathsf{langs}(N)$ yields an extension of $\kappa_L \circ \mu_L$. Since the semilattice langs(*N*) is generated by the languages accepted by single states of *N*, this extension has degree at most *k*.

(2) Conversely, let $\rho \colon \Sigma^* \to \mathbf{JSL}(S, S)$ be a boolean representation of degree *k* extending $\kappa_L \circ \mu_L$, witnessed by an injective equivariant map $h \colon \mathsf{SLD}(L) \rightarrowtail S$. One can equip *S* with a **JSL**-dfa structure making *h* an automata morphism. Since morphisms preserve accepted languages, it follows that *S* accepts *L*. Then the nfa of join-irreducibles of *S*, see Remark 3.4, is a *k*-state nfa accepting *L*.

As an application, let us return to the dependency relation DR*<sub>L</sub>* introduced in Remark 3.14(2). Recall that a *biclique* of a relation *R* ⊆ *X* × *Y* (viewed as a bipartite graph) is a subset of the form $X' \times Y' \subseteq R$, where $X' \subseteq X$ and $Y' \subseteq Y$. A *biclique cover* of *R* is a set C of bicliques with $R = \bigcup \mathcal{C}$. The *bipartite dimension* dim(*R*) is the least cardinality of any biclique cover of *R*.

**Theorem 4.8 (Gruber-Holzer [18]).** *For every regular language L, we have*

$$
\dim(\mathcal{D}\mathcal{R}\_L) \le \text{ns}(L).
$$

We give a new algebraic proof of this result based on boolean representations.

*Proof.* (1) The task of computing biclique covers is well-known to be equivalent to the *set basis* problem. Given a family *C* ⊆ P(*Y* ) of subsets of a finite set *Y* , a set basis for *C* is a family *B* ⊆ P(*Y* ) such that each element of *C* can be expressed as a union of elements of *B*. A relation *R* ⊆ *X* × *Y* has a biclique cover of size *k* iff the family *C<sup>R</sup>* = {*R*[*x*] : *x* ∈ *X*}⊆P(*Y* ) of neighborhoods of nodes in *X* has a set basis of size *k*.

(2) Given an instance $C \subseteq \mathcal{P}(Y)$ of the set basis problem, consider the ∪-subsemilattice $\overline{C} \subseteq \mathcal{P}(Y)$ generated by *C*, i.e. the semilattice of all unions of sets in *C*. We claim that *C* has a set basis of size at most *k* iff there exists an extension of $\overline{C}$ of degree at most *k*, i.e. a monomorphism $\overline{C} \rightarrowtail S$ into some finite semilattice *S* with $|J(S)| \leq k$.

For the "only if" direction, suppose that $B \subseteq \mathcal{P}(Y)$ is a set basis of *C* of size at most *k*. Then the embedding $\overline{C} \rightarrowtail \overline{B}$ gives an extension of $\overline{C}$ with the desired property: since the semilattice $\overline{B}$ has a set of generators with at most *k* elements, it has at most *k* join-irreducibles.

For the "if" direction, suppose that $m \colon \overline{C} \rightarrowtail S$ with $|J(S)| \leq k$ is given. Since the free semilattice $\mathcal{P}(Y)$ is an injective object of **JSL** [19, Corollary 2.9], there exists a morphism $f \colon S \to \mathcal{P}(Y)$ extending the embedding $\overline{C} \rightarrowtail \mathcal{P}(Y)$. Consider the image $S' \subseteq \mathcal{P}(Y)$ of *f*, leading to the commutative diagram below:

We thus have $C \subseteq S' \subseteq \mathcal{P}(Y)$. Every set of generators of the semilattice $S'$ is a set basis of *C*. Since the morphism *e* is surjective, we have $|J(S')| \leq |J(S)| \leq k$, i.e. $S'$ has a set of generators with at most *k* elements.

(3) Let $C_{\mathcal{DR}_L} \subseteq \mathcal{P}(\mathsf{LD}(L^{\mathsf{r}}))$ be the instance of the set basis problem corresponding to the dependency relation $\mathcal{DR}_L \subseteq \mathsf{LD}(L) \times \mathsf{LD}(L^{\mathsf{r}})$. Note that $\overline{C_{\mathcal{DR}_L}}$ consists of all $\mathcal{DR}_L[X]$ for $X \subseteq \mathsf{LD}(L)$. Thus, Theorem 3.15(1) shows that $\overline{C_{\mathcal{DR}_L}} \cong \mathsf{SLD}(L)$. In particular, every extension of the canonical boolean representation of *Σ*<sup>∗</sup> yields an extension of the semilattice $\overline{C_{\mathcal{DR}_L}}$ of the same degree. Therefore, by parts (1) and (2) and Theorem 4.7, we have $\dim(\mathcal{DR}_L) \leq \mathrm{ns}(L)$, as required.
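
The reduction used in parts (1) and (2) is easy to make concrete. The following brute-force Python sketch computes the bipartite dimension of a small relation via the set-basis reformulation of part (1); it is exponential and intended only as an illustration, and all names and the example relation are ours.

```python
from itertools import combinations

def has_set_basis(C, B):
    """Is B a set basis for C, i.e. is every member of C a union of members of B?"""
    return all(set().union(*[b for b in B if b <= c]) == c for c in C)

def bipartite_dimension(R, X, Y):
    """dim(R): the least size of a biclique cover, computed by brute force over
    candidate set bases for the neighbourhood family {R[x] : x in X}."""
    C = [frozenset(y for y in Y if (x, y) in R) for x in X]
    subsets = [frozenset(s) for r in range(len(Y) + 1)
               for s in combinations(sorted(Y), r)]
    for k in range(len(C) + 1):
        for B in combinations(subsets, k):
            if has_set_basis(C, B):
                return k
    return len(C)

# The path y1 -- x1 -- y2 -- x2 -- y3, viewed as a bipartite graph, has dimension 2:
# the two bicliques {x1} x {y1,y2} and {x2} x {y2,y3} cover it, and no single biclique does.
R = {('x1', 'y1'), ('x1', 'y2'), ('x2', 'y2'), ('x2', 'y3')}
assert bipartite_dimension(R, ['x1', 'x2'], ['y1', 'y2', 'y3']) == 2
```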

Theorem 4.7 motivates the following definition, which can be considered the key concept of our paper:

**Definition 4.9.** The *nondeterministic syntactic complexity* n*μ*(*L*) of a regular language *L* is the least degree of any boolean representation of syn(*L*) extending the canonical boolean representation *<sup>κ</sup><sup>L</sup>* : syn(*L*) <sup>→</sup> **JSL**(SLD(*L*)*,* SLD(*L*)).

Just like the degrees of boolean representations of *Σ*<sup>∗</sup> determine the state complexity of nfas, we will provide an automata-theoretic characterization of n*μ*(*L*) in terms of *subatomic* nfas in Theorem 4.14 below.

**Definition 4.10.** An nfa accepting the language *L* is called

(1) *atomic* if the language accepted by each of its states lies in BLD(*L*), and

(2) *subatomic* if the language accepted by each of its states lies in BLRD(*L*).

The notion of an atomic nfa goes back to Brzozowski and Tamm [6], as does the following characterization.

**Notation 4.11.** For any nfa *N*, let rsc(*N*) denote the dfa obtained via the *reachable subset construction*, i.e. the dfa-reachable part of P(*N*).

**Theorem 4.12.** *An nfa N is atomic iff* rsc(*N*<sup>r</sup> ) *is a minimal dfa.*

We present a new conceptual proof, interpreting this theorem as an instance of the self-duality of **JSL**-dfas.

*Proof (Sketch).* Let *L* be the language accepted by *N*. We establish the theorem by showing each of the following statements to be equivalent to the next one:


The key step is (2)⇔(3), which follows via duality from Lemmas 3.11 and 3.12, and Proposition 3.16. All remaining equivalences follow from the definitions.
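
Theorem 4.12 also yields a simple, if naive, procedure for testing atomicity. The sketch below builds rsc(*N*<sup>r</sup>) for an nfa given as a transition dictionary and tests minimality of the resulting dfa by Moore-style partition refinement; the encoding and the example nfa are ours.

```python
def is_atomic(states, delta, initial, final, alphabet):
    """Test the criterion of Theorem 4.12: an nfa N is atomic iff rsc(N^r), the
    reachable subset construction of its reverse, is a minimal dfa.  Since rsc(N^r)
    is reachable by construction, minimality means that no two distinct states are
    language-equivalent, which is tested by partition refinement."""
    # reverse of N: flip every transition, swap initial and final states
    rdelta = {}
    for (q, a), succs in delta.items():
        for q2 in succs:
            rdelta.setdefault((q2, a), set()).add(q)
    # reachable subset construction of N^r, started from the final states of N
    start = frozenset(final)
    dstates, stack, dtrans = {start}, [start], {}
    while stack:
        X = stack.pop()
        for a in alphabet:
            Y = frozenset(q2 for q in X for q2 in rdelta.get((q, a), set()))
            dtrans[(X, a)] = Y
            if Y not in dstates:
                dstates.add(Y)
                stack.append(Y)
    dfinal = {X for X in dstates if X & set(initial)}
    # refine the partition {final, non-final} until stable
    label = {X: X in dfinal for X in dstates}
    while True:
        new_label = {X: (label[X],) + tuple(label[dtrans[(X, a)]] for a in alphabet)
                     for X in dstates}
        if len(set(new_label.values())) == len(set(label.values())):
            return len(set(label.values())) == len(dstates)
        label = new_label

# The nfa accepting exactly the word 'ab' is atomic.
Q, delta, I, F = {0, 1, 2}, {(0, 'a'): {1}, (1, 'b'): {2}}, {0}, {2}
assert is_atomic(Q, delta, I, F, alphabet='ab')
```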

The next theorem gives an analogous characterization of subatomic nfas. Again, the proof is based on duality.

**Theorem 4.13.** *An nfa N accepting the language L is subatomic iff the transition monoid of* rsc(*N*<sup>r</sup> ) *is isomorphic to the syntactic monoid* syn(*L*<sup>r</sup> )*.*

*Proof (Sketch).* Each of the following statements is equivalent to the next one:


The equivalence (3)⇔(4) follows via duality from Lemma 3.11, Proposition 3.17 and Proposition 3.19. All remaining equivalences follow from the definitions.

We are prepared to state the main result of our paper, an automata-theoretic characterization of the nondeterministic syntactic complexity:

**Theorem 4.14.** *For every regular language L, the nondeterministic syntactic complexity* n*μ*(*L*) *is the least number of states of any subatomic nfa accepting L.*

*Proof (Sketch).*

(1) Let *N* be a *k*-state subatomic nfa accepting the language *L*. As in the proof of Theorem 4.7, we consider the semilattice langs(*N*) = simple(P(*N*)). Then

$$\rho \colon \mathsf{syn}(L) \to \mathbf{JSL}(\mathsf{langs}(N), \mathsf{langs}(N)), \quad [w]_L \mapsto \lambda K.\, w^{-1}K,$$

is a representation of syn(*L*) of degree at most *k* extending *κL*.

(2) Conversely, let $\rho \colon \mathsf{syn}(L) \to \mathbf{JSL}(S, S)$ be a boolean representation extending $\kappa_L$, and let $h \colon \mathsf{SLD}(L) \rightarrowtail S$ be the embedding. As in the proof of Theorem 4.7, we can equip *S* with the structure of a **JSL**-dfa making *h* an automata morphism. Its nfa of join-irreducibles, see Remark 3.4, is a subatomic nfa accepting *L* with deg(*ρ*) states.

We conclude this section with the observation that the state complexity of unrestricted nfas, subatomic nfas and atomic nfas generally differs:

**Example 4.15 (Subatomic more succinct than atomic).** Consider the language *L* accepted by the nfa *N* shown below, along with the minimal dfas for *L* and *L*<sup>r</sup> . Each automaton has exactly one initial state, namely 0.

Brzozowski and Tamm [6] showed that there is no atomic nfa with four states accepting *L*. However, *N* is subatomic: one can verify that the transition monoids of dfa(*L*<sup>r</sup> ) and rsc(*N*<sup>r</sup> ) both have 22 elements. Since the former is the syntactic monoid of *L*<sup>r</sup> , they are isomorphic, and so Theorem 4.13 applies.

**Example 4.16 (Subatomic less succinct than general nfas).** There is a regular language for which no state-minimal nfa is subatomic:

$$L := \{ a^n \; : \; n \in \mathbb{N}, \; n \neq 5 \} \subseteq \{ a \}^\*.$$

It is accepted by the following nfa:

An exhaustive search shows that no subatomic nfa with five states accepts *L*. In fact, *L* is the unique (!) unary language with ns(*L*) ≤ 5 and ns(*L*) *<* n*μ*(*L*). Moreover, the above nfa and its reverse are the only state-minimal nfas for *L*.

#### **5 Applications**

While subatomic nfas are generally less succinct than unrestricted ones, all structural results concerning nondeterministic state complexity we have encountered in the literature are actually about nondeterministic syntactic complexity: they implicitly identify classes of languages where the two measures coincide. In the present section, we illustrate this in a few selected applications.

#### **5.1 Unary languages**

For unary languages *L* ⊆ {*a*}<sup>∗</sup>, two-sided derivatives are left derivatives. Thus, a unary nfa is atomic iff it is subatomic.

**Example 5.1 (Cyclic unary languages).** A unary language *L* is *cyclic* if its minimal dfa is a cycle [16]. We claim that ns(*L*)=n*μ*(*L*). To see this, let *d* := |LD(*L*)| be the *period* (i.e. number of states) of the minimal dfa. By Fact 1 of [16] (originally from [22]) every state-minimal nfa *N* accepting *L* is a disjoint union of cyclic dfas whose periods divide *d*. <sup>1</sup> Then <sup>|</sup>rsc(*N*<sup>r</sup> )<sup>|</sup> <sup>=</sup> *<sup>d</sup>*: we have <sup>|</sup>rsc(*N*<sup>r</sup> )| ≥ *d* since rsc(*N*<sup>r</sup> ) is a dfa accepting *L* = *L*<sup>r</sup> and *d* is the size of the minimal dfa for *<sup>L</sup>*, and <sup>|</sup>rsc(*N*<sup>r</sup> )| ≤ *d* because after *d* steps, each cycle will be back in its initial state. Thus *N* is atomic by Theorem 4.12 and hence subatomic.

We deduce the following result for (not necessarily unary) regular languages:

**Theorem 5.2.** *If* syn(*L*) *is a cyclic group, then* ns(*L*)=n*μ*(*L*)*.*

*Proof (Sketch).* Suppose that syn(*L*) = tm(dfa(*L*)) is cyclic. Then there exists *<sup>w</sup>*<sup>0</sup> <sup>∈</sup> *<sup>Σ</sup>*<sup>∗</sup> such that the map *λX.w*−<sup>1</sup> <sup>0</sup> *X* : LD(*L*) → LD(*L*) generates tm(dfa(*L*)). Fix an alphabet *Σ*<sup>0</sup> = {*a*0} disjoint from *Σ* and consider the unary language

$$L\_0 := \{ a\_0^n \,:\, n \in \mathbb{N}, \, w\_0^n \in L \} \subseteq \Sigma\_0^\*.$$

Let *g* : *Σ*<sup>∗</sup> <sup>0</sup> → *Σ*<sup>∗</sup> be the monoid morphism where *g*(*a*0) := *w*0. Then we have the **JSL**-isomorphism

$$f \colon \mathsf{SLD}(L\_0) \xrightarrow{\cong} \mathsf{SLD}(L), \quad f(X^{-1}L\_0) := [g[X]]^{-1}L.$$

For each $a \in \Sigma$ choose $n_a \in \mathbb{N}$ such that $a^{-1}K = (w_0^{n_a})^{-1}K$ for all $K \in \mathsf{LD}(L)$. The respective transition endomorphisms of the **JSL**-automata SLD(*L*<sub>0</sub>) and SLD(*L*) determine each other in the sense that the following diagrams commute:

$$\begin{array}{ccc}
\mathsf{SLD}(L_0) & \xrightarrow{\;f\;} & \mathsf{SLD}(L) \\
{\scriptstyle a_0^{-1}(-)}\big\downarrow & & \big\downarrow{\scriptstyle w_0^{-1}(-)} \\
\mathsf{SLD}(L_0) & \xrightarrow{\;f\;} & \mathsf{SLD}(L)
\end{array}
\qquad\qquad
\begin{array}{ccc}
\mathsf{SLD}(L_0) & \xrightarrow{\;f\;} & \mathsf{SLD}(L) \\
{\scriptstyle (a_0^{n_a})^{-1}(-)}\big\downarrow & & \big\downarrow{\scriptstyle a^{-1}(-)} \\
\mathsf{SLD}(L_0) & \xrightarrow{\;f\;} & \mathsf{SLD}(L)
\end{array}$$

Then ns(*L*) = ns(*L*0) by Theorem 4.7 and n*μ*(*L*)=n*μ*(*L*0) by Theorem 4.14. Moreover, by Example 5.1 we know that ns(*L*0)=n*μ*(*L*0), so the claim follows.

**Example 5.3 (**n*μ*(*L*) **no larger than Chrobak normal form).** A unary nfa is in *Chrobak normal form* [8, 13] if it has a single initial state and at most one state with multiple successors, all of which lie in disjoint cycles. We claim that for any nfa *N* in Chrobak normal form accepting the language *L*, we have

$$\text{n}\mu(L) \le |N|,$$

<sup>1</sup> In [16] nfas are restricted to have a single initial state and so are distinguished from unions of dfas; the latter are valid nfas from our perspective.

where |*N*| denotes the number of states of *N*. To see this, observe that each state of *N* up to and including the unique choice state accepts some left derivative of *L*. The successors of the choice state collectively accept a derivative *u*−<sup>1</sup>*L*; this language is cyclic because it is a finite union of cyclic languages. Therefore, by Example 5.1 we may replace the cycles by an atomic nfa accepting *u*−<sup>1</sup>*L*, without increasing the number of states. The resulting nfa is atomic.

Since every unary nfa on *n* states can be transformed into an nfa in Chrobak normal form with *O*(*n*<sup>2</sup>) states [8, Lemma 4.3], we get:

**Corollary 5.4.** *If L is a unary regular language, then* n*μ*(*L*) = *O*(ns(*L*)<sup>2</sup>)*.*

#### **5.2 Languages with a canonical state-minimal nfa**

There are several natural classes of regular languages for which *canonical* stateminimal nondeterministic acceptors have been identified. We show that these acceptors are actually subatomic. In our arguments, we frequently consider the *length* of a finite semilattice *S*, i.e. the maximum length *n* of any ascending chain *s*<sup>0</sup> *< s*<sup>1</sup> *<...< s<sup>n</sup>* in *S*. Note that since every element is uniquely determined by the set of join-irreducibles below it, the length of *S* is at most |*J*(*S*)|.

#### **Example 5.5 (Bideterministic and biseparable languages).**

(1) A language is called *bideterministic* if it is accepted by a dfa whose reverse is also a dfa. In this case, the minimal dfa is a minimal nfa [34,38]. Bideterministic languages have been studied in the context of automata learning [2] and coding theory, where they are known as *rectangular codes* [27, 36]. We show that for every bideterministic language *L*,

$$\operatorname{ns}(L) = \operatorname{n}\mu(L) = |\mathsf{LD}(L)|.$$

To this end, we first note that by [36, Theorem 3.1] a language *L* ⊆ *Σ*<sup>∗</sup> is bideterministic iff the left derivatives of *L* are pairwise disjoint. This implies that SLD(*L*) is a boolean algebra with atoms LD(*L*). Since the length of a boolean algebra equals the number of atoms (= join-irreducibles), we conclude that for every finite semilattice extension $\mathsf{SLD}(L) \rightarrowtail S$, the semilattice *S* has length at least |LD(*L*)|. Thus, $|\mathsf{LD}(L)| \leq |J(S)|$, so any representation *ρ* extending $\kappa_L$ or $\kappa_L \circ \mu_L$ satisfies $|\mathsf{LD}(L)| \leq \deg(\rho)$. Hence, $\mathrm{ns}(L) = \mathrm{n}\mu(L) = |\mathsf{LD}(L)|$ by Theorems 4.7 and 4.14. In particular, the minimal dfa of *L* is a minimal nfa.

(2) A language *L* is *biseparable* if SLD(*L*) is a boolean algebra [28].<sup>2</sup> For every biseparable language *L*, the *canonical residual automaton* [12], i.e. the nfa *N<sup>L</sup>* of join-irreducibles of the minimal **JSL**-dfa SLD(*L*), is a state-minimal nfa; it is subatomic because every state of *N<sup>L</sup>* accepts a derivative of *L*. This follows exactly as in (1): our argument only used that SLD(*L*) is a boolean algebra.

<sup>2</sup> Actually [28] defines biseparability as a property of nfas, and characterizes biseparable nfas as those accepting a language *L* for which no ∪-irreducible left derivative is contained in the union of other ∪-irreducible left derivatives. This is equivalent to the lattice SLD(*L*) being boolean, i.e. to *L* being 'biseparable' in our sense.

**Example 5.6 (Maximal reachability).** A folklore result asserts that if *N* is an nfa whose accepted language *L* satisfies $|\mathsf{LD}(L)| = 2^{|N|}$, then *N* is state-minimal. Since LD(*L*) forms the set of states of the minimal dfa for *L* and rsc(*N*) accepts *L*, we have rsc(*N*) = P(*N*). It follows that the **JSL**-dfa P(*N*) is reachable and simple, hence isomorphic to the minimal **JSL**-dfa SLD(*L*). This proves that SLD(*L*) is a boolean algebra, i.e. *L* is a biseparable language. We conclude from Example 5.5(2) that $\mathrm{ns}(L) = \mathrm{n}\mu(L) = |N|$ and that *N<sub>L</sub>* is a subatomic minimal nfa.

**Example 5.7 (BiRFSA and topological languages).** So far SLD(*L*) has been a boolean algebra. But the argument in Example 5.5 also applies when SLD(*L*) is a distributive lattice, noting that the length of a finite distributive lattice is equal to the number of its join-irreducibles [17, Corollary 2.14]. Languages with this property are called *topological* [1]. It thus follows as in Example 5.5(2) that for any topological language *L*, the canonical residual automaton *N<sup>L</sup>* is subatomic and a state-minimal nfa. Thus, ns(*L*)=n*μ*(*L*) = |*J*(SLD(*L*))|.

There is another class of languages where *N<sup>L</sup>* is known to be a state-minimal nfa, the *biRFSA* languages [28]. A language *L* is called biRFSA if *N<sup>L</sup>* is isomorphic to (*N<sup>L</sup>*r) r . Surprisingly, these languages are exactly the topological ones:

(1) *Suppose that L is topological*. Recall that *N<sup>L</sup>* is the nfa of join-irreducibles of the minimal **JSL**-dfa. Thus, it has states *J*(SLD(*L*)) and transitions given by *<sup>X</sup> <sup>a</sup>* −→ *<sup>Y</sup>* iff *<sup>Y</sup>* <sup>⊆</sup> *<sup>a</sup>*−<sup>1</sup>*<sup>X</sup>* for *<sup>a</sup>* <sup>∈</sup> *<sup>Σ</sup>*. Moreover, a join-irreducible *<sup>j</sup>* is initial iff *<sup>j</sup>* <sup>⊆</sup> *<sup>L</sup>* and final iff *ε* ∈ *j*. Since the lattice SLD(*L*) is distributive, we have a canonical bijection between its join- and meet-irreducibles:

$$\tau \colon J(\mathsf{SLD}(L)) \xrightarrow{\cong} M(\mathsf{SLD}(L)), \quad \tau(j) = \bigcup \{ X \in \mathsf{SLD}(L) : j \not\subseteq X \}.$$

Let *θ* be the unique map making the following diagram commute, where dr*<sup>L</sup>* is the restriction of the isomorphism of Proposition 3.13:

$$\begin{array}{ccc}
J(\mathsf{SLD}(L)) & \xrightarrow{\;\theta\;} & J(\mathsf{SLD}(L^{\mathsf{r}})) \\
 & {\scriptstyle \tau}\searrow & \big\downarrow{\scriptstyle \mathrm{dr}_L}\,\cong \\
 & & M(\mathsf{SLD}(L))
\end{array}$$

One can show *θ* to be an nfa isomorphism from *N<sup>L</sup>* to (*N<sup>L</sup>*r) r . Thus, *L* is biRFSA. (2) *Suppose that L is biRFSA.* Then we have a surjective **JSL**-morphism

$$[\mathcal{P}(J(\mathsf{SLD}(L)))]^{\mathsf{op}} \cong \mathcal{P}(J(\mathsf{SLD}(L^r))) \xrightarrow{e\_{L^r}} \mathsf{SLD}(L^r) \cong [\mathsf{SLD}(L)]^{\mathsf{op}},$$

where the first isomorphism follows from $N_L \cong (N_{L^{\mathsf{r}}})^{\mathsf{r}}$ and Lemma 3.11, the second isomorphism is given by Proposition 3.13, and $e_{L^{\mathsf{r}}}$ sends $X \subseteq J(\mathsf{SLD}(L^{\mathsf{r}}))$ to $\bigcup X$. The dual of this morphism is the injective **JSL**-morphism

$$m_L \colon \mathsf{SLD}(L) \rightarrowtail \mathcal{P}(J(\mathsf{SLD}(L)))$$

sending *K* ∈ SLD(*L*) to the set of all *j* ∈ *J*(SLD(*L*)) with *j* ⊆ *K*. Note that $e_L \circ m_L = \mathit{id}_{\mathsf{SLD}(L)}$, showing that SLD(*L*) is a retract of P(*J*(SLD(*L*))). Since **JSL**-retracts of finite distributive lattices are distributive, see e.g. [31, Lemma 2.2.3.15], it follows that SLD(*L*) is distributive. Thus, *L* is topological.

**Example 5.8 (Extremal languages).** Call a language *extremal* if SLD(*L*) has length |*J*(SLD(*L*))| i.e. we have an *extremal lattice* in the sense of Markowsky [29]. Again, the argument of Example 5.5 applies and we get ns(*L*)=n*μ*(*L*) = |*J*(SLD(*L*))|. Topological languages are extremal since every distributive lattice is an extremal lattice, although extremal languages need not be topological. Both classes are naturally characterized in terms of the reduced dependency relation:

(1) *<sup>L</sup>* is topological iff DR*<sup>j</sup> <sup>L</sup>* is essentially an order relation ≤*<sup>P</sup>* ⊆ *P* × *P* of a finite poset [30, Example 2.2.12].

(2) *<sup>L</sup>* is extremal iff DR*<sup>j</sup> <sup>L</sup>* is *upper unitriangularizable* [29, Theorem 11].

The latter means the adjacency matrix of the bipartite graph DR*<sup>j</sup> <sup>L</sup>* can be put in upper triangular form with ones along the diagonal, by permuting rows and columns. An order relation is upper unitriangularizable because it may be extended to a linear order.

### **6 Conclusion and Future Work**

Motivated by the duality theory of deterministic finite automata over semilattices, we introduced a natural class of nondeterministic finite automata called *subatomic nfas* and studied their state complexity in terms of boolean representations of syntactic monoids. Furthermore, we demonstrated that a large body of previous work on state minimization of general nfas actually constructs minimal subatomic ones. There are several directions for future work.

As illustrated by Theorem 4.8, the dependency relation DR*<sup>L</sup>* forms a useful tool for proving lower bounds on nfas. It is also a key element of the Kameda-Weiner algorithm [26,37] for minimizing nfas, which rests on computing biclique covers of DR*<sup>L</sup>*. We aim to give an algebraic interpretation of dependency relations based on the representation of finite semilattices by contexts [24], which can be augmented to a categorical equivalence between **JSL**<sup>f</sup> and a suitable category of bipartite graphs [31]. Under this equivalence, **JSL**-dfas correspond to *dependency automata*; in particular, the minimal **JSL**-dfa SLD(*L*) corresponds to a dependency automaton whose underlying bipartite graph is precisely the dependency relation DR*<sup>L</sup>*. We expect that this observation can lead to a fresh algebraic perspective on the Kameda-Weiner algorithm, as well as a generalization of it computing minimal (sub-)atomic nfas.

On a related note, we also intend to investigate the complexity of the minimization problem for (sub-)atomic nfas. While minimizing general nfas is PSPACE-complete, even if the input automaton is a dfa, we conjecture that the additional structure present in (sub-)atomic acceptors will simplify their minimization to an NP-complete task. First evidence in this direction is provided by Geldenhuys, van der Merve, and van Zijl [14], whose work implies that minimal atomic nfas can be efficiently computed in practice using SAT solvers.

### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **A String Diagrammatic Axiomatisation of Finite-State Automata**

Robin Piedeleu and Fabio Zanasi

University College London, London, UK, {r.piedeleu, f.zanasi}@ucl.ac.uk

**Abstract.** We develop a fully diagrammatic approach to finite-state automata, based on reinterpreting their usual state-transition graphical representation as a two-dimensional syntax of string diagrams. In this setting, we are able to provide a complete equational theory for language equivalence, with two notable features. First, the proposed axiomatisation is finite— a result which is provably impossible for the one-dimensional syntax of regular expressions. Second, the Kleene star is a derived concept, as it can be decomposed into more primitive algebraic blocks.

**Keywords:** string diagrams · finite-state automata · symmetric monoidal category · complete axiomatisation

### **1 Introduction**

Finite-state automata are one of the most studied structures in theoretical computer science, with an illustrious history and roots reaching far beyond, in the work of biologists, psychologists, engineers and mathematicians. Kleene [25] introduced regular expressions to give finite-state automata an algebraic presentation, motivated by the study of (biological) neural networks [31]. They are the terms freely generated by the following grammar:

$$e, f ::= e + f \mid ef \mid e^\* \mid 0 \mid 1 \mid a \in A \tag{1}$$

Equational properties of regular expressions were studied by Conway [14] who introduced the term *Kleene algebra*: this is an idempotent semiring with an operation (−)<sup>∗</sup> for iteration, called the (Kleene) star. The equational theory of Kleene algebra is now well-understood, and multiple complete axiomatisations, both for language and relational models, have been given. Crucially, Kleene algebra is not finitely-based: no finite equational theory can appropriately capture the behaviour of the star [35]. Instead, there are purely equational infinitary axiomatisations [28,4] and Kozen's finitary implicational theory [26].

Since then, much research has been devoted to extending Kleene algebra with operations capturing richer patterns of behaviour, useful in program verification. Examples include conditional branching (Kleene algebra with tests [27], and its recent guarded version [37]), concurrent computation (CKA [19,23]), and specification of message-passing behaviour in networks (NetKAT [1]).

The meta-theory of the formalisms above essentially rests on the same three ingredients: (1) given an operational model (e.g., finite-state automata), (2) devise a syntax (regular expressions) that is sufficiently expressive to capture the class of behaviours of the operational model (regular languages), and (3) find a complete axiomatisation (Kleene algebra) for the given semantics.

In this paper, we open up a direct path from (1) to (3). Instead of thinking of automata as a combinatorial model, we formalise them as a bona-fide (twodimensional) syntax, using the well-established mathematical theory of *string diagrams* and monoidal categories [36]. This approach lets us axiomatise the behaviour of automata directly, freeing us from the necessity of compressing them down to a one-dimensional notation like regular expressions.

This perspective not only sheds new light on a venerable topic, but has significant consequences. First, as our most important contribution, we are able to provide a *finite and purely equational* axiomatisation of finite-state automata, up to language equivalence. Intriguingly, this does not contradict the impossibility of finding a finite basis for Kleene algebra, as the algebraic setting is different: our result gives a finite presentation as a symmetric monoidal category, while the impossibility result prevents any such presentation from existing as an algebraic theory (in the standard sense). In other words, there is no finite axiomatisation based on terms (*tree*-like structures), but we demonstrate that there is one based on string diagrams (*graph*-like structures).

Secondly, embracing the two-dimensional nature of automata guarantees a strong form of compositionality that the one-dimensional syntax of regular expressions does not have. In the string diagrammatic setting, automata may have multiple inputs and outputs and, as a result, can be decomposed into subcomponents that retain a meaningful interpretation. For example, if we split the automata below left, the resulting components are still valid string diagrams within our syntax, below right:

In line with the compositional approach, it is significant that the Kleene star can be decomposed into more elementary building blocks (which come together to form a feedback loop):

[Diagram (3), not reproduced here: the Kleene star $e^*$ expressed as a feedback loop assembled from more elementary generators.]

This opens up interesting possibilities when studying extensions of Kleene algebra within the same approach— we elaborate on this in Section 6.

Finally, we believe our proof of completeness is of independent interest, as it relies on a fully diagrammatic reformulation of Brzozowski's minimisation algorithm [12]. In the string diagrammatic setting, the symmetries of the equational theory give this procedure a particularly elegant and simple form. Because all of the axioms involved in the determinisation procedure come with a dual, a codeterminisation procedure can be defined immediately by simply reversing the former. This reduces the proof of completeness to a proof that determinisation can be performed diagrammatically.

We should also note that this is not the first time that automata and regular languages are recast into a categorical mould. The *iteration theories* [5] of Bloom and Ésik, *sharing graphs* [17] of Hasegawa or *network algebras* [39] of Stefanescu are all categorical frameworks designed to reason about iteration or recursion, that have found fruitful applications in this domain. They are based on a notion of parameterised fixed-point which defines a categorical *trace* in the sense of [22]. While our proposal bears resemblance to (and is inspired by) this prior work, it goes beyond in one fundamental aspect: it is the first to give a *finite* complete axiomatisation of automata up to language equivalence.

A second difference is methodological: our syntax (4) does not feature any primitive for iteration or recursion. In particular, the star is a derived concept, in the sense that it is decomposable into more elementary operations (3). Categorically, our starting point is a compact-closed rather than traced category.

We elaborate on the relation between ours and existing work in Section 6. Omitted proofs can be found in [33].

### **2 Syntax and semantics**

*Syntax.* We fix an alphabet Σ of letters *a* ∈ Σ. We call Aut<sup>Σ</sup> the symmetric strict monoidal category freely generated by the following objects and morphisms:


[Display (4), not reproduced here: the generating objects and generating string-diagram morphisms of Aut<sup>Σ</sup>.]

Freely generating Aut<sup>Σ</sup> from these data (usually called a *symmetric monoidal theory* [42,11]) means that morphisms of Aut<sup>Σ</sup> will be the string diagrams obtained by pasting together (by sequential composition and monoidal product in AutΣ) the basic components in (4), and then quotienting by the laws of symmetric monoidal categories. For instance, (3) is a morphism of Aut<sup>Σ</sup> of type →, and

[a second example string diagram, together with its source and target types, is not reproduced here].

*Semantics.* We first define the semantics for string diagrams simply as a function, and then discuss how to extend it to a functor from Aut<sup>Σ</sup> to another category. Our interpretation maps generating morphisms to relations between regular expressions and languages over Σ:

[Equation (5), not reproduced here: it interprets each generator of Aut<sup>Σ</sup> as a relation between regular expressions and languages over Σ; the red generators are interpreted by the regular-expression operations (sum, product, star, 0, 1 and the letters of Σ), and the black generators by operations on languages (copying, deleting, the inclusion orders, and the right-concatenation action of a regular expression on a language).]

In (5), the semantics ⟦*e*⟧<sub>*R*</sub> ⊆ Σ∗ of a regular expression *e* ∈ RegExp is defined inductively on *e* (see (1)), in the standard way:

$$\begin{aligned}
[\![e + f]\!]_R &= [\![e]\!]_R \cup [\![f]\!]_R &\qquad [\![ef]\!]_R &= \{vw \mid v \in [\![e]\!]_R,\ w \in [\![f]\!]_R\}\\
[\![1]\!]_R &= \{\epsilon\} \qquad [\![0]\!]_R = \emptyset &\qquad [\![a]\!]_R &= \{a\} \qquad [\![e^{*}]\!]_R = \bigcup_{n \in \mathbb{N}} [\![e^{n}]\!]_R
\end{aligned}$$

where *e*<sup>*n*+1</sup> := *e e*<sup>*n*</sup> and *e*<sup>0</sup> := 1. The semantics highlights the different roles played by red<sup>1</sup> and black generators. In a nutshell, red generators stand for regular expressions (the sum, the constant 0, the product, the constant 1, the Kleene star, and the letters *a* of Σ), and black generators stand for operations on the set of languages (copy, delete, and feeding outputs back into inputs, in a way made more precise later). These two perspectives, which are usually merged, are kept distinct in our approach and only allowed to communicate via the action generator, which represents the action of regular expressions (the red wire) on languages via concatenation on the right.
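To make the inductive clauses concrete, here is a small sketch in Python (our own encoding of expressions as nested tuples; languages are truncated at a fixed word length so that the Kleene star becomes a finite union and the computation terminates):

```python
MAX_LEN = 4  # truncation bound: we only keep words of length <= MAX_LEN

def concat(L1, L2):
    # pointwise concatenation, discarding words that exceed the bound
    return {v + w for v in L1 for w in L2 if len(v + w) <= MAX_LEN}

def sem(e):
    """Finite approximation of [[e]]_R for expressions built from '0', '1',
    single letters, ('+', e, f), ('.', e, f) and ('*', e)."""
    if e == '0':
        return set()
    if e == '1':
        return {''}
    if isinstance(e, str):                 # a letter a of Sigma
        return {e}
    op = e[0]
    if op == '+':
        return sem(e[1]) | sem(e[2])
    if op == '.':
        return concat(sem(e[1]), sem(e[2]))
    if op == '*':                          # union of the truncated powers e^n
        base, result, power = sem(e[1]), {''}, {''}
        while True:
            power = concat(power, base)
            if power <= result:
                return result
            result |= power

# [[ab(a + ab)*]]_R, truncated at length MAX_LEN
print(sorted(sem(('.', ('.', 'a', 'b'), ('*', ('+', 'a', ('.', 'a', 'b')))))))
```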

In order for this mapping to be functorial from AutΣ, we now introduce a suitable target semantic category. Interestingly, this will not be the category Rel of sets and relations: indeed, the identity morphisms are not interpreted as identities of Rel. Instead, the semantic domain will be the category Prof**B** of *Boolean(-enriched) profunctors* [15] (also called in the literature relational profunctors [20] or weakening relations [32]).

**Definition 1.** *Given two preorders* (*X*, ≤*X*) *and* (*Y*, ≤*Y*)*, a* Boolean profunctor *R* : *X* → *Y is a relation R* ⊆ *X* × *Y such that if* (*x*, *y*) ∈ *R, x*′ ≤*X x and y* ≤*Y y*′*, then* (*x*′, *y*′) ∈ *R.*

<sup>1</sup> The reader with a greyscale version of the paper should see light grey generators instead.

*Preorders and Boolean profunctors form a symmetric monoidal category* Prof**B** *with composition given by relational composition. The identity for an object* (*X*, ≤*X*) *is the order relation* ≤*X itself. The monoidal product is the usual product of preorders.*
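Definition 1 can be checked mechanically on finite preorders; the sketch below (names and example are ours) tests the closure condition and illustrates that relational composition with the order relation leaves a Boolean profunctor unchanged, as stated above.

```python
def is_profunctor(R, leq_X, leq_Y):
    # Definition 1: if (x, y) is in R, x' <=_X x and y <=_Y y', then (x', y') is in R
    return all((xp, yp) in R
               for (x, y) in R
               for (xp, x0) in leq_X if x0 == x
               for (y0, yp) in leq_Y if y0 == y)

def compose(R, S):
    # relational composition, which is also composition in Prof_B
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

# the two-element chain 0 <= 1, viewed as a preorder
leq = {(0, 0), (1, 1), (0, 1)}
R = {(0, 1), (1, 1)}                            # "second component is 1": closed as required
print(is_profunctor(R, leq, leq))               # True
print(compose(leq, R) == R == compose(R, leq))  # the order relation acts as the identity
```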

The rich features of our diagrammatic language are reflected in the profunctor interpretation. Indeed, the order relation is built into the wires themselves. The two possible directions represent the identities on the ordered set of languages and on the same set with the reversed order, respectively. The additional red wire represents the set RegExp of regular expressions, with *equality* as the associated order relation.<sup>2</sup> It is clear that all monochromatic generators satisfy the condition of Definition 1. Similarly, the action generator is a Boolean profunctor: if ((*e*, *L*), *K*) is such that *L* ⟦*e*⟧<sub>*R*</sub> ⊆ *K* and *L*′ ⊆ *L*, *K* ⊆ *K*′, then we have *L*′ ⟦*e*⟧<sub>*R*</sub> ⊆ *L* ⟦*e*⟧<sub>*R*</sub> ⊆ *K* ⊆ *K*′ by monotonicity of the product of languages. We can conclude that

**Proposition 1.** ⟦·⟧ *defines a symmetric monoidal functor of type* AutΣ → Prof**B***.*

In particular, because AutΣ is free, we can unambiguously assign meaning to any composite diagram from the semantics of its components, using composition and the monoidal product in Prof**B**:

For string diagrams *c* and *d* of the appropriate types, writing ; for sequential composition and ⊗ for the monoidal product,

$$\begin{aligned}
[\![\,c\,;d\,]\!] &= \{(L,K) \mid \exists M.\ (L,M) \in [\![c]\!] \text{ and } (M,K) \in [\![d]\!]\}\\
[\![\,c \otimes d\,]\!] &= \{((L_1,L_2),(K_1,K_2)) \mid (L_1,K_1) \in [\![c]\!] \text{ and } (L_2,K_2) \in [\![d]\!]\}
\end{aligned}$$

*Example 1.* We include here a worked-out example to show how to compute the behaviour of a composite diagram which, as we will see, represents the action by concatenation of the regular language *a*∗. We assign variable names to each wire: *O* to the top wire of the feedback loop, *N* to the output wire of the action node, and *M* to the middle wire joining the two black nodes, so that we can compute (writing *d* for the composite diagram):

$$\begin{aligned}
[\![d]\!] &= \{(L,K) \mid \exists M,N,O.\ L,N \subseteq M,\ O\,[\![a]\!]_R \subseteq N,\ M \subseteq O,K\}\\
&= \{(L,K) \mid \exists N,O.\ L,N \subseteq O,\ L,N \subseteq K,\ Oa \subseteq N\}\\
&= \{(L,K) \mid \exists O.\ Oa \subseteq O,\ L \subseteq O,\ L,O \subseteq K\}
\end{aligned}$$

Since *Oa* ⊆ *O* and *L* ⊆ *O* together are equivalent to *L* ∪ *Oa* ⊆ *O*, we have ⟦*d*⟧ = {(*L*, *K*) | ∃*O* s.t. *L* ∪ *Oa* ⊆ *O* and *L*, *O* ⊆ *K*}. Finally, by Arden's lemma [2], *La*∗ is the *least* solution of the language inequality *L* ∪ *Xa* ⊆ *X*; thus ⟦*d*⟧ = {(*L*, *K*) | ∃*O* s.t. *La*∗ ⊆ *O* and *L*, *O* ⊆ *K*} = {(*L*, *K*) | *La*∗ ⊆ *K*}.
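Arden's lemma can be illustrated by Kleene iteration: the sketch below (our own names; languages truncated at a fixed length so the iteration is finite) computes the least *X* with *L* ∪ *Xa* ⊆ *X* and checks that it coincides with *La*∗.

```python
MAX_LEN = 6  # only track words up to this length so that the iteration terminates

def append_letter(X, a):
    return {w + a for w in X if len(w) + 1 <= MAX_LEN}

def least_solution(L, a):
    """Least X (within the truncation) satisfying L u Xa <= X, by Kleene iteration."""
    X = set()
    while True:
        bigger = X | L | append_letter(X, a)
        if bigger == X:
            return X
        X = bigger

L = {'', 'b'}
La_star = {v + 'a' * n for v in L for n in range(MAX_LEN + 1) if len(v) + n <= MAX_LEN}
print(least_solution(L, 'a') == La_star)   # True: the least solution is La*, as Arden's lemma states
```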

### **3 Equational theory**

In Figure 1 we introduce =*KDA*, the (finite) equational theory of *Kleene Diagram Algebra*, on AutΣ. It will later be shown to be *complete* for the given semantics. We explain some salient features of =*KDA* below.

<sup>2</sup> Note that we can always consider any set with equality as a poset and that, therefore, Rel is a subcategory of Prof**B**, but not vice-versa, for the simple reason that the identity relation of an arbitrary poset in Prof**B** is not mapped to the identity relation in Rel.

**Fig. 1.** Equational theory =*KDA* of Kleene Diagram Algebra.


Let =*KDA* be the smallest equational theory containing all equations in Fig. 1. Their *soundness* for the chosen semantics is not difficult to show and, for space reasons, we omit the proof. We now state our *completeness* result, whose proof will be discussed in Section 5.

#### **Theorem 1 (Completeness).** *For morphisms d, e in* AutΣ*, d* =*KDA e iff* ⟦*d*⟧ = ⟦*e*⟧*.*

*Remark 1.* In the usual approach to the theory of regular languages (e.g. [26]), a completeness result like Theorem 1 is typically proven by first defining a class of models for the algebraic theory, and showing that the standard semantics constitutes the initial/free model. Our proof is different in flavour, but equivalent: taking advantage of the categorical formulation of our diagrammatic syntax and its semantics, we construct an equivalence of categories between our model and the diagrams quotiented by the equations of KDA.

*Remark 2.* Some axiomatisations of Kleene algebra use a partial order between terms, which can be defined from the idempotent monoid structure: *f* ≤ *e* iff *e* + *f* = *e*. At the semantic level, it corresponds to inclusion of languages. Similarly, using the idempotent bimonoid structure of our equational theory, we can define a partial order on → diagrams: *f* ≤ *e* iff joining *e* and *f* via the black bimonoid structure yields *e*. This partial order can also be extended to all morphisms *n*→*m*, by using the vertical composition of *n* copies of the copy generator and *m* copies of the merge generator instead.

*Remark 3.* There are no specific equations relating the atomic actions *<sup>a</sup>* (*<sup>a</sup>* <sup>∈</sup> Σ). This is because, as we study automata, we are interested in the *free* monoid Σ∗ over Σ. However, nothing would prevent us from modelling other structures. Free commutative monoids (powers of **N**), whose rational subsets correspond to semilinear sets [14, Chapter 11] would be of particular interest.

### **4 Encoding regular expressions and automata**

A major appeal of our approach is that both regular expressions and automata can be uniformly represented in the graphical language of string diagrams, and the translation of one into the other becomes an equational derivation in =*KDA*. In fact, we will see there is a close resemblance between automata and the shape of the string diagrams interpreting them — the main difference being that string diagrams are *composable* structures.

In this section we describe how regular expressions (resp. automata) can be encoded as string diagrams, such that their semantics corresponds in a precise way to the languages that they describe (resp. recognise).

In a sense, regular expressions are already part of the graphical syntax, as the red generators: for any regular expression *e*, one may always construct a 'red' string diagram, with no inputs and a single red output wire, whose semantics is {(•, *e*)}. However, these alone are meaningless, since their image under the semantics is simply the free term algebra RegExp (see (7)). They acquire meaning as they *act* on the set of languages over Σ, represented by the black wire.

#### **4.1 From regular expressions to string diagrams**

To define these encodings, it is convenient to introduce the following syntactic sugar. We will write ⟨*e*⟩ for the composite of the red diagram for *e* with the action, as defined below left, with the particular case ⟨*a*⟩ of a letter *a* ∈ Σ on the right:

$$
\langle e \rangle \;:=\; \text{[string diagram omitted]} \qquad\qquad \langle a \rangle \;:=\; \text{[string diagram omitted]}
\tag{6}
$$

Using this action, we can inductively define an encoding of regular expressions into string diagrams of AutΣ; the defining clauses, one per regular-expression constructor, are given as string diagrams (omitted here).

For example, the encoding of *ab*(*a* + *ab*)∗ is the string diagram (8) [omitted here].

As expected, the translation preserves the language interpretation of regular expressions in the sense that the following proposition makes precise.

**Proposition 2.** *For any regular expression e, the encoding of e has semantics* {(*L*, *K*) | ⟦*e*⟧<sub>*R*</sub> *L* ⊆ *K*}*.*

#### **4.2 From automata to string diagrams...**

Example (8) suggests that the string diagram corresponding to a regular expression *e* looks a lot like a nondeterministic finite-state automaton (NFA) for *e*. In fact, the encoding can be seen as the diagrammatic counterpart of Thompson's construction [40] that builds an NFA from a regular expression.

We can generalise the encoding of regular expressions and translate NFA directly into string diagrams, in at least two ways. The first is to encode an NFA as the diagrammatic counterpart of its transition relation. The second is to translate directly its graph representation into the diagrammatic syntax.

*Encoding the transition relation.* This is a simple variant of the translation of matrices over semirings that has appeared in several places in the literature [29,42].

Let *A* be an NFA with set of states *Q*, initial state *q*<sup>0</sup> ∈ *Q*, accepting states *F* ⊆ *Q* and transition relation *δ* ⊆ *Q* × Σ × *Q*. We can represent *δ* as a string diagram *d* with |*Q*| incoming wires on the left and |*Q*| outgoing wires on the right. The *j*th port on the left of *d* is connected to the *i*th port on the right through an ⟨*a*⟩ whenever (*qi*, *a*, *qj*) ∈ *δ*. To accommodate nondeterminism, when the same two ports are connected by several different letters of Σ, we join these using the black copy and merge generators. When (*qi*, ε, *qj*) ∈ *δ*, the two ports are simply connected via a plain identity wire. If there is no *a* such that (*qi*, *a*, *qj*) ∈ *δ*, the two corresponding ports are disconnected.

For example, the transition relation of an NFA with three states and *δ* = {(*q*0, *a*, *q*1), (*q*1, *b*, *q*2), (*q*2, *a*, *q*1), (*q*2, *a*, *q*2)} (disregarding the initial and accepting states for the moment) is depicted on the right. Conversely, given such a diagram, we can recover *δ* by collecting Σ-weighted paths from left to right ports.
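The encoding of *δ* and its converse can be mirrored over an ordinary matrix of label sets; the sketch below (our own data representation, not the diagrammatic one) stores, for each pair of ports, the set of labels connecting them and recovers *δ* by reading the labelled connections back off.

```python
def delta_to_matrix(n_states, delta):
    """delta is a set of triples (i, a, j); entry (i, j) of the matrix collects every
    label between the two ports (several labels between the same ports model nondeterminism)."""
    matrix = {(i, j): set() for i in range(n_states) for j in range(n_states)}
    for (i, a, j) in delta:
        matrix[(i, j)].add(a)
    return matrix

def matrix_to_delta(matrix):
    # recover delta by collecting the labelled connections between ports
    return {(i, a, j) for (i, j), labels in matrix.items() for a in labels}

delta = {(0, 'a', 1), (1, 'b', 2), (2, 'a', 1), (2, 'a', 2)}
m = delta_to_matrix(3, delta)
print(m[(2, 1)], m[(0, 2)])          # {'a'} and set(): connected and disconnected ports
print(matrix_to_delta(m) == delta)   # True: the round trip loses nothing
```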

To deal with the initial state, we add an additional incoming wire connected to the right port corresponding to the initial state of the automaton. Similarly, for accepting states we add an additional outgoing wire, connected to the left ports corresponding to each accepting state, via a merge if there is more than one. Finally, we trace out the |*Q*| wires of the diagrammatic transition relation to obtain the associated string diagram. In other words, for an NFA with initial state *q*0, set of accepting states *F*, and transition relation *δ*, we obtain the string diagram on the right, where *d* is the diagrammatic counterpart of *δ* as defined above, *e*0 is the injection of a single wire as the first amongst the |*Q*| wires, and *f* deletes all wires that are not associated to states in *F* and merges the remaining ones into a single outgoing wire.

For example, if *A* with *δ* as above has initial state *q*<sup>0</sup> and accepting state {*q*2}, we get the diagram below left; instead, if all states are accepting, we obtain the diagram below right:

The correctness of this simple translation is justified by a semantic correspondence between the language recognised by a given NFA *A* and the denotation of the corresponding string diagram.

**Proposition 3.** *Given an NFA A which recognises the language L, let dA be its associated string diagram, constructed as above. Then* ⟦*dA*⟧ = {(*K*, *K*′) | *LK* ⊆ *K*′}*.*

*From graphs to string diagrams.* The second way of translating automata into string diagrams mimics more directly the combinatorial representation of automata. The idea (which should be sufficiently intuitive to not need to be made formal here) is, for each state, to use the black merge generator to represent incoming edges, and the black copy generator to represent outgoing edges. As above, labels *a* ∈ Σ will be modelled using ⟨*a*⟩. For example, the graph and the associated string diagram corresponding to the NFA above are

Note the initial state of the automaton corresponds to the left interface of the string diagram, and the accepting state to the right interface. As before, when there are multiple accepting states, they all connect to a single right interface via a merge. For example, if we make all states accepting in the automaton above, we get the following diagrammatic representation:

#### **4.3 ...and back**

The previous discussion shows how NFAs can be seen as string diagrams of type →. The converse is also true: we now show how to extract an automaton from any string diagram *<sup>d</sup>* : →, such that the language the automaton recognises matches the denotation of *d*.

In order to phrase this correspondence formally, we need to introduce some terminology. We call *left-to-right* those string diagrams whose domain and codomain contain only forward-directed black wires, i.e. their type is of the form *n*→*m*. The idea is that, in any such string diagram, the *n* left interfaces act as *inputs* of the computation, and the *m* right interfaces act as *outputs*. For instance, (9) is a left-to-right diagram of type →.

A string diagram *d* is *atomic* if the only red generators occurring in *d* are the letters *a* ∈ Σ. By *unfolding* all red components ⟨*e*⟩ in any left-to-right diagram, using axioms (C1)-(C5), we can prove the following statement.

**Proposition 4.** *Any left-to-right diagram is* =*KDA-equivalent to an atomic one.*

For instance, the string diagram on the left of (8) is =*KDA*-equivalent to the atomic one on the right.

Given a certain subset of generators, a *block* is a vertical composite of these generators followed by some permutation of the wires.

**Definition 2.** *A* matrix-diagram *(resp.* generalised matrix-diagram*) is a left-to-right diagram that factors as a block of the black copy and delete generators, followed by a block of* ⟨*a*⟩ *for a* ∈ Σ *(resp.* ⟨*e*⟩ *for e* ∈ RegExp*), and finally a block of the black merge and zero generators.*

To each matrix-diagram *d* we can associate a unique transition relation *δ* by gathering paths from each input to each output: (*qi*, *a*, *qj*) ∈ *δ* if there is an ⟨*a*⟩ joining the *i*th input to the *j*th output.

A transition relation is *ε-free* if it does not contain the empty word. It is *deterministic* if it is ε-free and, for each *i* and each *a* ∈ Σ, there is at most one *j* such that (*qi*, *a*, *qj*) ∈ *δ*. We will apply these terms to matrix-diagrams and the associated transition relation interchangeably. The example of Section 4.2 above, with the three blocks highlighted, is a matrix-diagram. It is ε-free but not deterministic, since there are two *a*-labelled transitions starting from the third input.

Given a matrix-diagram *d* : *l*+*n*→*p*+*m*, we will write *dij*, with *i* ∈ {*l*, *n*} and *j* ∈ {*p*, *m*}, for the subdiagrams corresponding to the appropriate submatrices.

**Definition 3.** *For any left-to-right diagram d* : *n*→*m, a* representation *is a matrix-diagram* ˆ*d* : *l*+*n*→*l*+*m such that d coincides with* ˆ*d with its l additional wires traced out, and* ˆ*dll,* ˆ*dnl are ε-free. It is a* deterministic representation *if moreover* ˆ*dll is deterministic.*

For example, given the string diagram below on the left, the one on the right is a representation for it, whose highlighted matrix-diagram is the same as above.

We will refer to the associated matrix-diagram ˆ*d* as the *transition matrix* of a given representation. From a → diagram with representation <sup>ˆ</sup>*<sup>d</sup>* : *l*+1→*l*+<sup>1</sup> we can construct an NFA from its transition matrix ˆ*d* as follows:


The construction above is the inverse of that of Section 4.2. The link between the constructed automaton and the original string diagram is summarised in the following statement, which is a straightforward corollary of Proposition 3.

**Proposition 5.** *For a diagram d* : → *with a representation* ˆ*d, let A be the automaton associated with* ˆ*d, constructed as above. Then L is the language recognised by A iff* ⟦*d*⟧ = {(*K*, *K*′) | *LK* ⊆ *K*′}*.*


#### **Proposition 6.** *Any left-to-right diagram has a representation.*

We have established a correspondence between → diagrams and automata. What about arbitrary left-to-right diagrams *n*→*m*? To characterise the precise relationship between our syntax and regular expressions we can prove a *Kleene theorem* for AutΣ. Recall from Definition 2 that a *generalised matrix-diagram* is the diagrammatic counterpart of a matrix whose coefficients are regular expressions. It turns out that every left-to-right diagram can be put in this form.

**Proposition 7 (Kleene's theorem for** AutΣ**).** *Any left-to-right diagram is equal to a generalised matrix-diagram.*

As a result, the semantics of a given *n*→*<sup>m</sup>* diagram is fully characterised by an *m* × *n* array of regular languages.

#### **4.4 Interlude: from regular to context-free languages**

It is worth pointing out how a simple modification of the diagrammatic syntax takes us one notch up the Chomsky hierarchy, leaving the realm of regular languages for that of context-free grammars and languages.

Our syntax allows us to specify systems of language equations of the form *aX* ⊆ *Y*. In this context, feedback loops can be interpreted as fixed-points. For example, the automaton below left, and its corresponding string diagram, below right, translate to the system of equations at the center:

This translation can be obtained by simply labelling each state with a variable and adding one inequality of the form *Xia* ⊆ *Xj* for each *a*-transition from state *i* to state *j*. The system we obtain corresponds very closely to the ⟦−⟧-semantics of the associated string diagram.
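The translation into a system of inequalities is entirely mechanical; the following sketch (names are ours, and the two boundary inequalities for *L* and *K* follow our reading of the open wires in Section 4.2) prints one inequality per transition.

```python
def automaton_to_inequalities(delta, initial, accepting):
    """One variable X_i per state and one inequality 'X_i a <= X_j' per a-transition."""
    ineqs = [f"X{i} {a} <= X{j}" for (i, a, j) in sorted(delta)]
    ineqs.append(f"L <= X{initial}")                     # input language enters at the initial state
    ineqs += [f"X{q} <= K" for q in sorted(accepting)]   # accepting states feed the output
    return ineqs

for line in automaton_to_inequalities({(0, 'a', 1), (1, 'b', 2), (2, 'a', 1), (2, 'a', 2)},
                                      initial=0, accepting={2}):
    print(line)
```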

The distinction between red and black wires can be understood as a type discipline that only allows linear uses of the product of languages. It is legitimate and enlightening to ask what would happen if we forgot about red wires and interpreted the action directly as the product. We would replace the action by a new generator with semantics {((*M*, *L*), *K*) | *ML* ⊆ *K*}.

This would allow us to specify systems of language equations with unrestricted uses of the product on the left of inclusions, e.g. *UVW* ⊆ *X*. Equations of this form are similar to the production rules (e.g. *X* → *UVW*) of context-free grammars and it is well-known that the least solutions of this class of systems are precisely *context-free* languages [14, Chapter 10].

For example, we could encode the language *X* → *XX* | (*X*) | ε of properly matched parentheses as the least solution of the system ε ⊆ *X*, (*X*) ⊆ *X*, *XX* ⊆ *X*, which gives the diagram displayed on the right.
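The least solution of such a system can again be approximated by Kleene iteration; a truncated sketch (our own code, bounding the word length so the iteration is finite) for the parentheses system above:

```python
MAX_LEN = 6  # truncation bound, as in the earlier fixed-point sketch

def step(X):
    # one joint application of the three inequalities, truncated at MAX_LEN
    new = {''}                                                      # eps <= X
    new |= {'(' + w + ')' for w in X if len(w) + 2 <= MAX_LEN}      # (X) <= X
    new |= {v + w for v in X for w in X if len(v + w) <= MAX_LEN}   # XX <= X
    return X | new

X = set()
while step(X) != X:
    X = step(X)
print(sorted(X, key=len))   # '', '()', '(())', '()()', '((()))', ... up to length 6
```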

#### **5 Completeness and Determinisation**

This section is devoted to proving our completeness result, Theorem 1. We use a normal form argument: more specifically, we mimic automata-theoretic results to rewrite every string diagram to a normal form corresponding to a minimal deterministic finite automaton (DFA). We achieve this by implementing Brzozowski's algorithm [12] through diagrammatic equational reasoning. The proof proceeds in three distinct steps.


We will now write equations in =*KDA* simply as = to simplify notation and say that diagrams *c* and *d* are *equal* when *c* =*KDA d*.

First, we use the symmetries of the equational theory to make simplifying assumptions about the diagrams to consider in the completeness proof.

*A few simplifying assumptions.* Without loss of generality, the proof we give is restricted to string diagrams with no red wires in their domain or codomain. This is simply a matter of convenience: the same proof would work for more general diagrams that may contain red wires in their (co)domain, at the cost of significantly cluttering diagrams. Henceforth, one can simply think of the labels *x* for the action as uniquely identifying one open red wire in a diagram. With this convention, two or more occurrences of the same *x* in a diagram can be seen as connected to the same red wire on the left, via copying. That we can safely do so is a consequence of the completeness of =*KDA* restricted to the monochromatic red fragment, itself a consequence of [11, Theorem 6.1].

Arbitrary objects in AutΣ are lists of the three generating objects. We have already motivated focusing on string diagrams with no open red wires, so the objects we care about are lists of black wires in their two orientations. The following proposition implies that, without loss of generality, for the proof of completeness we can restrict further to left-to-right diagrams (Section 4.2).

**Proposition 8.** *There is a natural bijection between sets of string diagrams of the form*

Proposition 8 tells us that we can always bend the incoming wires to the left and the outgoing wires to the right before applying some equations, and recover the original orientation of the wires by bending them back into their original place later.

#### **5.1 Determinisation**

In diagrammatic terms, a nondeterministic transition of the automaton associated to (a representation of) a given diagram corresponds to a subdiagram in which a copy node feeds two occurrences of ⟨*a*⟩, for some *a* ∈ Σ. Using the definition of ⟨*a*⟩ in (6) and the axiom (D1), such a subdiagram is equal to a single ⟨*a*⟩ followed by a copy; this will prove to be the engine of our determinisation procedure, along with the fact that any red expression can be copied and deleted. The next two theorems generalise the ability to copy and delete to arbitrary left-to-right diagrams.

**Theorem 2.** *For any left-to-right diagram d* : *m*→*n, we have*

For *d* : *m*→*n*, let *dij* be the string diagram of type → obtained by plugging every input except the *i*th one, and every output except the *j*th one, with the appropriate black unit and counit generators. Theorem 2 implies that string diagrams are fully characterised by their → subdiagrams.

**Corollary 1.** *Given d*,*<sup>e</sup>* : *m*→*n, d* <sup>=</sup>*KDA e iff dij* <sup>=</sup>*KDA eij, for all* <sup>1</sup> <sup>≤</sup> *<sup>i</sup>* <sup>≤</sup> *m and* 1 ≤ *j* ≤ *n.*

Thus, we can restrict our focus further to left-to-right → diagrams, without loss of generality. We are now able to devise a determinisation procedure for representations of diagrams, which we illustrate below on a simple example.

**Proposition 9 (Determinisation).** *Any diagram* → *has a deterministic representation.*

*Dealing with useless states.* Notice that our deterministic form is *partial* and that the determinisation procedure disregards *useless states*, i.e., parts of a string diagram that do not reach an output wire. None of these contribute to the semantics of the diagram and can be safely eliminated using Theorem 2 (del)-(co-del).

### **5.2 Minimisation and completeness**

As explained above, our proof of completeness is a diagrammatic reformulation of Brzozowski's algorithm which proceeds in four steps: determinise, reverse, determinise, reverse. We already know how to determinise a given diagram. The other three steps are simply a matter of looking at string diagrams differently and showing that all the equations that we needed to determinise them, can be performed in reverse.

We say that a matrix-diagram is *co-deterministic* if the converse of its associated transition relation is deterministic.

*Proof (Theorem 1 (Completeness)).* We have a procedure to show that, if ⟦*d*⟧ = ⟦*e*⟧, then there exists a string diagram *f* in normal form such that *d* = *f* = *e*. This normal form is the diagrammatic counterpart of the *minimal* automaton associated to *d* and *e*. In our setting, it is the deterministic representation equal to *d* and *e* with the smallest number of states. This is unique because we can obtain from it the corresponding minimal automaton, which is well known to be unique. First, given any string diagram we can obtain a representation for it by Proposition 6. Then we obtain a minimal representation by splitting Brzozowski's algorithm into two steps.


### **6 Discussion**

In this paper, we have given a fully diagrammatic treatment of finite-state automata, with a finite equational theory that axiomatises them up to language equivalence. We have seen that this allows us to decompose the regular operations of Kleene algebra, like the star, into more primitive components, resulting in greater modularity. In this section, we compare our contributions with related work, and outline directions for future research.

Traditionally, computer scientists have used *syntax or railroad diagrams* to visualise regular expressions and context-free grammars [41]. These diagrams resemble ours very closely but have remained mostly informal. More recently, Hinze has treated the single input-output case rigorously, as a pedagogical tool to teach the correspondence between finite-state automata and regular expressions [18]. He did not, however, study their equational properties.

Bloom and Ésik's *iteration theories* provide a general categorical setting in which to study the equational properties of iteration for a broad range of structures that appear in programming language semantics [5]. They are cartesian categories equipped with a parameterised fixed-point operation closely related to the feedback notion we have used to represent the Kleene star. However, the monoidal category of interest in this paper is *compact-closed* (only the full subcategory over the black objects, to be precise), a property that is incompatible with the existence of categorical products (any category that has both collapses to a preorder [30]). Nevertheless, the subcategory of left-to-right diagrams (Section 4.2) is a (matrix) iteration theory [6], a structure that Bloom and Ésik have used to give an (infinitary) axiomatisation of regular languages [4].

Similarly, Stefanescu's work on *network algebra* provides a unified algebraic treatment of various types of networks, including finite-state automata [39]. In general, network algebras are traced monoidal categories where the product is not necessarily cartesian, and therefore more general than iteration theories. In both settings however, the trace is a global operation, that cannot be decomposed further into simpler components. In our work, on the other hand, the trace can be defined from the compact-closed structure, as was depicted in (3).

Note that the compact closed subcategory in this paper can be recovered from the traced monoidal category of left-to-right diagrams, via the *Int construction* [22]. Therefore, as far as mathematical expressiveness is concerned, the two approaches are equivalent. However, from a methodological point of view, taking the compact closed structure as primitive allows for improved compositionality, as example (2) in the introduction illustrates. Furthermore, the compact closed structure can be finitely presented relative to the theory of symmetric monoidal categories, whereas the trace operation cannot. This matters greatly in this paper, where finding a finite axiomatisation is our main concern.

Finally, the idea of treating regular expressions as a free structure acting on a second algebraic structure also appeared in Pratt's *dynamic algebras*, which axiomatise the propositional fragment of dynamic modal logic [34]. Like our formalism, the variety of dynamic algebras is finitely-based. But they assume more structure: the second algebraic structure is a Boolean algebra.

In all the formalisms we have mentioned, the difficulty typically lies in capturing the behaviour of iteration, whether as the star in Kleene algebra [26,4] or as a trace operator [5] in iteration theories and network algebra [39]. The axioms should be coercive enough to force it to be *the least fixed-point* of the language map *L* → {ε} ∪ *LK*. In Kozen's axiomatisation of Kleene algebra [26], for example, this is achieved through (a) the axiom 1 + *ee*∗ ≤ *e*∗ (star is a fixpoint) and (b) the Horn clause *f* + *ex* ≤ *x* ⇒ *e*∗*f* ≤ *x* (star is the least fixpoint). In our work, (a) is a consequence of the unfolding of the star into a feedback loop and can be derived from the other axioms. (b) is more subtle, but can be seen as a consequence of the (D1)-(D4) axioms. These allow us to (co)copy and (co)delete arbitrary diagrams (Theorem 2) and we conjecture that this is what forces the star to be a single definite value, not just any fixed-point, but the least one. Making this statement precise is the subject of future work.

The difficulty in capturing the behaviour of fixed-points is also the reason why we decided to work with an additional red wire, to encode the action of regular expressions on the set of languages—without it, global (co)copying and (co)deleting (Theorem 2) cannot be reduced to the local (D1)-(D4) axioms. There is another route, that leads to an infinitary axiomatisation: we could dispense with the red generators altogether and take *<sup>a</sup>* (for *a* ∈ Σ) as primitive instead, with global axioms to (co)copy and (co)delete arbitrary diagrams. This would pave the way for a reformulation of our work in the context of iteration (matrix) theories, where the ability to (co)copy and (co)delete arbitrary expressions is already built-in. We leave this for future work.

There is an intriguing parallel between our case study and the positive fragment of relation algebra (also known as allegories [16]). Indeed, allegories, like Kleene algebra, do not admit a finite axiomatisation [16]. However, this result holds for standard algebraic theories. It has been shown recently that a structure equivalent to allegories can be given a finite axiomatisation when formulated in terms of string diagrams in monoidal categories [9]. It seems like the greater generality of the monoidal setting—algebraic theories correspond precisely to the particular case of cartesian monoidal categories [11]—allows for simpler axiomatisations in some specific cases. In the future we would like to understand whether this phenomenon, of which now we have two instances, can be understood in a general context.

Lastly, extensions of Kleene Algebra, such as Concurrent Kleene Algebra (CKA) [19,23] and NetKAT [1], are increasingly relevant in current research. Enhancing our theory =*KDA* to encompass these extensions seems a promising research direction, for two main reasons. First, the two-dimensional nature of string diagrams has been proven particularly suitable to reason about concurrency (see e.g. [7,38]), and more generally about resource exchange between processes (see e.g. [10,13,21,3,8]). Second, when trying to transfer the good meta-theoretical properties of Kleene Algebra (like completeness and decidability) to extensions such as CKA and NetKAT, the cleanest way to proceed is usually in a modular fashion. The interaction between the new operators of the extension and the Kleene star usually represents the greatest challenge to this methodology. Now, in =*KDA*, the Kleene star is decomposable into simpler components (see (3)) and there is only one specific axiom (C5) governing its behaviour. We believe this is a particularly favourable starting point to modularise a meta-theoretic study of CKA and NetKAT with string diagrams, taking advantage of the results we presented in this paper for finite-state automata.

### **References**



#### Work-sensitive Dynamic Complexity of Formal Languages<sup>⋆</sup>

Jonas Schmidt<sup>1</sup>, Thomas Schwentick<sup>1</sup>, Till Tantau<sup>2</sup>, Nils Vortmeier<sup>3</sup>, and Thomas Zeume<sup>4</sup>

<sup>1</sup> TU Dortmund University, Dortmund, Germany {jonas2.schmidt,thomas.schwentick}@tu-dortmund.de
<sup>2</sup> Universität zu Lübeck, Lübeck, Germany tantau@tcs.uni-luebeck.de
<sup>3</sup> University of Zurich, Zurich, Switzerland nils.vortmeier@uzh.ch
<sup>4</sup> Ruhr University Bochum, Bochum, Germany thomas.zeume@rub.de

Abstract. Which amount of parallel resources is needed for updating a query result after changing an input? In this work we study the amount of work required for dynamically answering membership and range queries for formal languages in parallel constant time with polynomially many processors. As a prerequisite, we propose a framework for specifying dynamic, parallel, constant-time programs that require small amounts of work. This framework is based on the dynamic descriptive complexity framework by Patnaik and Immerman.

Keywords: Dynamic complexity · work · parallel constant time.

### 1 Introduction

Which amount of parallel resources is needed for updating a query result after changing an input, in particular if we only want to spend constant parallel time?

In classical, non-dynamic computations, parallel constant time is well understood. Constant time on CRAMs, a variant of CRCW-PRAMs used by Immerman [15], corresponds to constant depth in circuits, that is, to the circuit class AC<sup>0</sup>, as well as to expressibility in first-order logic with built-in arithmetic (see, for instance, the books of Immerman [15, Theorem 5.2] and Vollmer [26, Theorems 4.69 and 4.73]). Even more, the amount of work, that is, the overall number of operations of all processors, is connected to the number of variables required by a first-order formula [15, Theorem 5.10].

However, the work aspect of constant parallel time algorithms is less understood for scenarios where the input is subject to changes. To the best of our knowledge, there is little previous work on constant-time PRAMs in dynamic scenarios. A notable exception is early work showing that spanning trees

<sup>⋆</sup> A full version of the paper is available at [21], https://arxiv.org/abs/2101.08735


and connected components can be computed in constant time by CRCW-PRAMs with O(n<sup>4</sup>) and O(n<sup>2</sup>) processors, respectively [24].

In an orthogonal line of research, parallel dynamic constant time has been studied from a logical perspective in the dynamic complexity framework by Patnaik and Immerman [20] and Dong, Su, and Topor [7,6]. In this framework, the update of query results after a change is expressed by first-order formulas. The formulas may refer to auxiliary relations, whose updates in turn are also specified by first-order formulas (see Section 3 for more details). The queries maintainable in this fashion constitute the dynamic complexity class DynFO. Such queries can be updated by PRAMs in constant time with a polynomial number of processors. In this line of work, the main focus in recent years has been on proving that queries are in DynFO, thus emphasising the constant-time aspect. It has, for instance, been shown that all context-free languages [11] and the reachability query [5] are in DynFO.

However, if one tries to make the "DynFO approach" for dynamic problems relevant for practical considerations, the work that is needed to carry out the specified updates, hence the *work* of a parallel algorithm implementing them, is a crucial factor. The current general polynomial upper bounds are too coarse. In this paper, we therefore initiate the investigation of more work-efficient dynamic programs that can be specified by first-order logic and that can therefore be carried out by PRAMs in constant time. To do so, we propose a framework for specifying such dynamic, parallel, constant-time programs, which is based on the DynFO framework, but allows for more precise (and better) bounds on the necessary work of a program.

Goal 1.1. *Extend the formal framework of dynamic complexity towards the consideration of parallel work.*

Towards this goal, we link the framework we propose to the CRAM framework in Section 3. In fact, the new framework also takes a somewhat wider perspective, since it does not focus exclusively on one query under a set of change operations, but rather considers dynamic problems that may have several change and query operations (and could even have operations that combine the two). Therefore, from now on we speak about dynamic problems and not about (single) queries.

Goal 1.2. *Find work-efficient* DynFO*-programs for dynamic problems that are known to be in* DynFO *(but whose dynamic programs*<sup>5</sup> *are not competitive, workwise).*

Ideally we aim at showing that dynamic problems can be maintained in DynFO with sublinear or even polylogarithmic work. One line of attack for this goal is to study dynamic algorithms and to see whether they can be transformed into parallel O(1)-time algorithms with small work. There is a plethora of work

<sup>5</sup> In the field of dynamic complexity the term "dynamic program" is traditionally used for the programs for updating the auxiliary data after a change. The term should not be confused with the "dynamic programming" technique used in algorithm design.

that achieves polylogarithmic sequential update time (even though sometimes only amortised), see for instance [3,9,12,13]. For many of these problems it is known that they can be maintained in constant parallel time with polynomial work; e.g., as mentioned above, it has been shown that connectivity and the maintenance of regular (and even context-free) languages are in DynFO.

In this paper, we follow this approach for dynamic string problems, more specifically, dynamic problems that allow membership and range queries for regular and context-free languages. Our results can be summarised as follows.

We show in Section 5 that regular languages can be maintained in constant time with O(*n*<sup>ε</sup>) work for all ε > 0 and that for star-free languages even work O(log *n*) can be achieved. These results hold for range and membership queries.

For context-free languages, the situation is not as nice, as we observe in Section 6. We show that, subject to a well-known conjecture, we cannot hope to maintain membership in general context-free languages in DynFO with less than O(*n*<sup>1.37−ε</sup>) work. The same statement holds even for the bound O(*n*<sup>2−ε</sup>) and "combinatorial" dynamic programs. For Dyck languages, that is, sets of well-formed strings of parentheses, we show that this barrier does not apply. Their membership problem can be maintained with O(*n*(log *n*)<sup>3</sup>) work in general, and with polylogarithmic work if there is only one kind of parentheses. By a different approach, range queries can be maintained with work O(*n*<sup>1+ε</sup>) in general, and O(*n*<sup>ε</sup>) for one parenthesis type.

*Related work.* A complexity theory of incremental time has been developed in [19]. We discuss previous work on dynamic complexity of formal languages in Sections 5 and 6.

### 2 Preliminaries

Since dynamic programs are based on first-order logic, we represent inputs like graphs and strings as well as "internal" data structures as logical structures.

A *schema* τ consists of a set of relation symbols and function symbols with a corresponding arity. A constant symbol is a function symbol with arity 0. A *structure* D over schema τ with finite domain D has, for every k-ary relation symbol R ∈ τ, a relation R<sup>D</sup> ⊆ D<sup>k</sup>, as well as a function f<sup>D</sup> : D<sup>k</sup> → D for every k-ary function symbol f ∈ τ. We allow partially defined functions and write f<sup>D</sup>(ā) = ⊥ if f<sup>D</sup> is not defined for ā in D. Formally, this can be realized using an additional relation that contains the domain of f<sup>D</sup>. We occasionally also use functions f<sup>D</sup> : D<sup>k</sup> → D<sup>ℓ</sup> for some ℓ > 1. Formally, such a function represents ℓ functions f<sub>1</sub><sup>D</sup>,...,f<sub>ℓ</sub><sup>D</sup> : D<sup>k</sup> → D with f<sup>D</sup>(ā) def= (f<sub>1</sub><sup>D</sup>(ā),...,f<sub>ℓ</sub><sup>D</sup>(ā)).

Throughout this work, the structures we consider provide a linear order ≤ on their domain D. As we can thus identify D with an initial sequence of the natural numbers, we usually just assume that D = [n] def = {0,...,n−1} for some natural number n.

We assume familiarity with first-order logic FO, and refer to [17] for basics of Finite Model Theory. In this paper, unless stated otherwise, first-order formulas *always* have access to a linear order on the domain, as well as compatible functions + and × that express addition and multiplication, respectively. This holds in particular for formulas in dynamic programs. We use the following "if-then-else" construct: if ϕ is a formula, and t<sub>1</sub> and t<sub>2</sub> are terms, then ITE(ϕ, t<sub>1</sub>, t<sub>2</sub>) is a term. Such a term evaluates to the result of t<sub>1</sub> if ϕ is satisfied, otherwise to t<sub>2</sub>.

Following [11], we encode words of length (at most) n over an alphabet Σ by *word structures*, that is, as relational structures W with universe {0,...,n−1}, one unary relation R<sub>σ</sub> for each symbol σ ∈ Σ and the canonical linear order ≤ on {0,...,n−1}. We only consider structures for which, for every position i, R<sub>σ</sub>(i) holds for at most one σ ∈ Σ, and write W(i) = σ if R<sub>σ</sub>(i) holds and W(i) = ε if no such σ exists. We write word(W) for the word represented by W, that is, the concatenation w = W(0) ◦ ... ◦ W(n−1). As an example, the word structure W<sub>0</sub> with domain {0, 1, 2, 3}, W(1) = a, W(3) = b and W(0) = W(2) = ε represents the string ab. We write word(W)[ℓ, r] for the word W(ℓ) ◦ ... ◦ W(r).
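A word structure can be sketched directly (our own class and method names, not the paper's formal machinery); empty positions contribute the empty word to the concatenation:

```python
class WordStructure:
    """Word structure over domain {0, ..., n-1}: each position holds a letter or is empty."""
    def __init__(self, n, letters=()):
        self.cells = [None] * n             # None plays the role of an empty position
        for i, sigma in letters:
            self.cells[i] = sigma

    def word(self, lo=0, hi=None):
        hi = len(self.cells) - 1 if hi is None else hi
        return ''.join(c for c in self.cells[lo:hi + 1] if c is not None)

W0 = WordStructure(4, letters=[(1, 'a'), (3, 'b')])
print(W0.word())        # 'ab', as in the example above
print(W0.word(2, 3))    # 'b', i.e. word(W)[2, 3]
```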

Informally, a *dynamic problem* can be seen as a data type: it consists of some underlying structure together with a set Δ of operations. We distinguish between *change operations* that can modify the structure and *query operations* that yield information about the structure, but combined operations could be allowed, as well. Thus, a dynamic problem is characterised by the schema of its underlying structures and the operations that it supports.<sup>6</sup>

In this paper, we are particularly interested in dynamic language problems, defined as follows. Words are represented as word structures W with elementary change operations set<sub>σ</sub>(i) (with the effect that W(i) becomes σ if it was ε before) and reset(i) (with the effect that W(i) becomes ε).

For some fixed language L over some alphabet Σ, the dynamic problem RangeMember(L) further supports one query operation range(ℓ, r). It yields the result true if word(W)[ℓ, r] is in L, and otherwise false.

In the following, we denote a word structure W as a sequence w<sub>0</sub> ... w<sub>n−1</sub> of letters with w<sub>i</sub> ∈ Σ ∪ {ε}, in order to have an easier, less formal notation. Altogether, the dynamic problem RangeMember(L) is defined as follows.

#### Problem: RangeMember(L)

- Input: a sequence w = w<sub>0</sub> ... w<sub>n−1</sub> of letters with w<sub>i</sub> ∈ Σ ∪ {ε}
- Changes: set<sub>σ</sub>(i) for σ ∈ Σ: sets w<sub>i</sub> to σ, if w<sub>i</sub> = ε; reset(i): sets w<sub>i</sub> to ε
- Queries: range(ℓ, r): is w<sub>ℓ</sub> ◦ ··· ◦ w<sub>r</sub> ∈ L?

In this example, the query range maps (binary) pairs of domain elements to a truth value and thus defines a (binary) relation over the universe of the input word structure. We call such a query *relational*. We will also consider *functional* queries mapping tuples of elements to elements.

Another dynamic problem considered here is Member(L), which is defined similarly to RangeMember(L) but, instead of range, only has the Boolean query operation member that yields true if w<sub>0</sub> ◦ ... ◦ w<sub>n−1</sub> ∈ L holds.
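For orientation, a naive dynamic program for these problems simply stores the word and re-runs a DFA on each query; a sketch (our own class, assuming L is given by a DFA) is below. Its per-query work is roughly linear, which is the kind of cost the work-efficient programs of Sections 5 and 6 aim to improve on.

```python
class NaiveRangeMember:
    """Naive baseline for RangeMember(L): store the word and re-run a DFA per query."""
    def __init__(self, n, dfa):
        self.cells = [None] * n          # None stands for an empty position
        self.dfa = dfa                   # (initial state, transition dict, accepting states)

    def set(self, i, sigma):
        if self.cells[i] is None:
            self.cells[i] = sigma

    def reset(self, i):
        self.cells[i] = None

    def range(self, lo, hi):
        q, trans, acc = self.dfa
        for c in self.cells[lo:hi + 1]:
            if c is not None:            # empty positions contribute nothing
                q = trans[(q, c)]
        return q in acc

    def member(self):
        return self.range(0, len(self.cells) - 1)

# a DFA for (ab)* over {a, b}; state 2 is a rejecting sink
dfa = (0, {(0, 'a'): 1, (0, 'b'): 2, (1, 'a'): 2, (1, 'b'): 0, (2, 'a'): 2, (2, 'b'): 2}, {0})
D = NaiveRangeMember(6, dfa)
D.set(0, 'a'); D.set(1, 'b'); D.set(4, 'a'); D.set(5, 'b')
print(D.member(), D.range(4, 5))   # True True: both 'abab' and 'ab' are in (ab)*
```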

<sup>6</sup> This view is a bit broader than the traditional setting of Dynamic Complexity, where there can be various change operations but usually only one fixed query is supported.

### 3 Work-sensitive Dynamic Complexity

Since we are interested in the work that a dynamic program does, our specification mechanism for dynamic programs is considerably more elaborate than the one used in previous papers on dynamic complexity. We introduce the mechanism in this section in two steps: first the general form of dynamic programs, and then a more pseudo-code-oriented syntax. Afterwards, we discuss how these dynamic programs translate into work-efficient constant-time parallel programs.

#### 3.1 The Dynamic Complexity Framework

Our general form of dynamic programs mainly follows [23], but is adapted to the slightly broader view of a dynamic problem as a data type. For a more gentle introduction to dynamic complexity, we refer to [22].

The goal of a *dynamic program* for a dynamic problem Π is to support all its operations Δ. To do so, it stores and updates an auxiliary structure A over some schema τaux, over the same domain as the input structure I for Π.

A (first-order) dynamic program P consists of a set of (first-order) *update rules* for change operations and *query rules* for query operations. More precisely, a program has one query rule over schema τaux per query operation that specifies how the (relational) result of that operation is obtained from the auxiliary structure. Furthermore, for each change operation δ ∈ Δ, it has one update rule per auxiliary relation or function that specifies the updates after a change based on δ.

A query rule is of the form on query Q(¯p) yield ϕQ(¯p), where ϕ<sup>Q</sup> is the (first-order) *query formula* with free variables from p¯.

An update rule for a k-ary auxiliary relation R is of the form

#### on change δ(p̄) update R at (t<sub>1</sub>(p̄; x̄),...,t<sub>k</sub>(p̄; x̄)) as ϕ<sup>R</sup><sub>δ</sub>(p̄; x̄) where C(x̄).

Here, ϕ<sup>R</sup><sub>δ</sub> is the (first-order) *update formula*, t<sub>1</sub>,...,t<sub>k</sub> are first-order terms (possibly using the ITE construct) over τ<sub>aux</sub>, and C(x̄), called a *constraint* for the tuple x̄ = x<sub>1</sub>,...,x<sub>ℓ</sub> of variables, is a conjunction of inequalities x<sub>i</sub> ≤ f<sub>i</sub>(n) using functions f<sub>i</sub> : N → N, where n is the size of the domain and 1 ≤ i ≤ ℓ. We demand that all functions f<sub>i</sub> are first-order definable from + and ×.

The effect of such an update rule after a change operation δ(ā) is as follows: the new relation R<sup>A′</sup> in the updated auxiliary structure A′ contains all tuples from R<sup>A</sup> that are *not* equal to (t<sub>1</sub>(ā; b̄),...,t<sub>k</sub>(ā; b̄)) for any tuple b̄ that satisfies the constraint C; and additionally R<sup>A′</sup> contains all tuples (t<sub>1</sub>(ā; b̄),...,t<sub>k</sub>(ā; b̄)) such that b̄ satisfies C and A |= ϕ<sup>R</sup><sub>δ</sub>(ā; b̄) holds.

Phrased more operationally, an update is performed by enumerating all tuples b̄ that satisfy C, evaluating ϕ<sup>R</sup><sub>δ</sub>(ā; b̄) on the old auxiliary structure A, and, depending on the result, adding the tuple (t<sub>1</sub>(ā; b̄),...,t<sub>k</sub>(ā; b̄)) to R (if it was not already present), or removing that tuple from R (if it was present).
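This operational reading is easy to render as code; a small sketch follows (our own names: phi plays the role of the update formula, terms of (t<sub>1</sub>,...,t<sub>k</sub>), and constraint/bounds of C):

```python
from itertools import product

def apply_update_rule(R, A, a_bar, terms, phi, constraint, bounds):
    """R is the k-ary relation (a set of tuples), A the old auxiliary structure,
    a_bar the change parameters; bounds gives the range of each constrained variable."""
    new_R = set(R)
    for b_bar in product(*(range(b + 1) for b in bounds)):   # all tuples with b_i <= f_i(n)
        if not constraint(b_bar):
            continue
        tup = tuple(t(a_bar, b_bar) for t in terms)
        if phi(A, a_bar, b_bar):
            new_R.add(tup)        # formula holds: the tuple is in the new relation
        else:
            new_R.discard(tup)    # formula fails: the tuple is removed if present
    return new_R

# toy usage: after a change "delta(a)", maintain the unary relation R = {(i,) | i <= a}
R = {(0,), (1,)}
R = apply_update_rule(R, A=None, a_bar=(3,), terms=[lambda a, b: b[0]],
                      phi=lambda A, a, b: b[0] <= a[0],
                      constraint=lambda b: True, bounds=[5])
print(sorted(R))   # [(0,), (1,), (2,), (3,)]
```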

Update rules for auxiliary functions are similar, but instead of an update formula that decides whether a tuple of the form (t1(¯a; ¯b),...,tk(¯a; ¯b)) is contained in the updated relation, it features an update term that determines the new function value for a function argument of the form (t1(¯a; ¯b),...,tk(¯a; ¯b)).

We say that P is a dynamic program for a dynamic problem Π if it supports all its operations and, in particular, always yields correct results for query operations. More precisely, the result of applying a query operation after a sequence α of change operations on an initial structure I<sub>0</sub> must coincide with the result of evaluating the query rule on the auxiliary structure that is obtained by applying the update rules corresponding to the change operations in α to an initial auxiliary structure A<sub>0</sub>. Here, an initial input structure I<sub>0</sub> over some domain D is *empty*, that is, it is a structure with empty relations and with all function values being undefined (⊥). The initial auxiliary structure A<sub>0</sub> is over the same domain D as I<sub>0</sub> and is defined from I<sub>0</sub> by some FO-definable initialization.

By DynFO, we denote the class of all dynamic problems that have a dynamic program in the sense we just defined.

#### 3.2 A syntax for work-efficient dynamic programs

In this paper we are particularly interested in dynamic programs that require little work to update the auxiliary structure after every change operation and to compute the result of a query operation. However, since dynamic programs do not come with an execution model, there is no direct way to define syntactically when, say, a DynFO-program has polylogarithmic work.

We follow a pragmatic approach here. We define a pseudo-code-based syntax for *update* and *query procedures* that will be used in place of the update and query *formulas* in rules of dynamic programs. This syntax has three important properties: (1) it is reasonably well readable (as opposed to strict first-order logic formulas), (2) it allows a straightforward translation of rules into proper DynFO-programs, and (3) it allows to associate a "work-bounding function" to each rule and to translate it into a PRAM program with O(1) parallel time and work bounded by this function.

The syntax of the pseudo-code has similarities with Abstract State Machines [4] and the PRAM-syntax of [16]. For simplicity, we describe a minimal set of syntactic elements that suffice for the dynamic programs in this paper. We encourage readers to have a look at Section 4 for examples of update rules with pseudo-code syntax.

We only spell out a syntax for *update procedures* that can be used in place of the update formula ϕ<sup>R</sup><sub>δ</sub>(p̄; x̄) of an update rule

#### on change δ(p̄) update R at (t<sub>1</sub>(p̄; x̄),...,t<sub>k</sub>(p̄; x̄)) as ϕ<sup>R</sup><sub>δ</sub>(p̄; x̄) where C(x̄).

Query procedures are defined similarly, but they cannot invoke any change operations for supplementary instances, and their only free variables are from p̄.

We allow some compositionality: a dynamic program on some *main instance* can use *supplementary instances* of other dynamic problems and invoke change or query operations of other dynamic programs on those instances. These supplementary instances are declared on a global level of the dynamic program and each has an associated identifier.

Update procedures P = P<sub>1</sub>; P<sub>2</sub> consist of two parts. In the *initial procedure* P<sub>1</sub>, no references to the free variables from x̄ are allowed, but change operations for supplementary instances can be invoked. We require that, for each change operation δ of the main instance and each supplementary instance S, at most one update rule for δ invokes change operations for S.

In the *main procedure* P2, no change operations for supplementary instances can be invoked, but references to x¯ are allowed.

More precisely, both P<sup>1</sup> and P<sup>2</sup> can use (a series of) instructions of the following forms:


Semantically, here and in the following n always refers to the size of the domain of the main instance. The initial procedure P<sup>1</sup> can further use change invocations instance.δ(¯y). However, they are not allowed in the scope of parallel branches. And we recall that in P<sup>1</sup> no variables from x¯ can be used.

The main procedure P<sup>2</sup> can further use return statements return condition or return term, but not inside parallel branches.

Of course, initial procedures can only have initial procedures in their conditional and parallel branches, and analogously for main procedures.

Conditions and terms are defined as follows. In all cases, y¯ denotes a tuple of terms and z is a *local variable*, not occurring in p¯ or x¯. In general, a *term* evaluates to a domain element (or to ⊥). It is built from


For the latter expression it is required that there is always exactly one domain element a ≤ g(n) satisfying condition.

A *condition* evaluates to true or false. It may be


All functions <sup>g</sup> : <sup>N</sup> <sup>→</sup> <sup>N</sup> in these definitions are required to be FO-definable. For assignments of relations R and functions f we demand that these symbols do *not* appear in τaux. If an assignment with a head f(¯y) or R(¯y) occurs in the scope of a parallel branch that binds variable z, then z has to occur as a term y<sup>i</sup> in y¯. We further demand that update procedures are well-formed, in the sense that every execution path ends with a return statement of appropriate type.

In our pseudo-code algorithms, we display update procedures P = P1; P2 with initial procedure P1 and main procedure P2 as

```
on change δ(p̄) with P1
   update R at (t1(p̄, x̄), ..., tk(p̄, x̄)), for all C(x̄), by: P2.
```
to emphasise that P1 only needs to be evaluated once for the update of R, and not once for every different value of x̄.

In a nutshell, the semantics of an update rule

$$\text{on change } \delta(\bar{p}) \text{ update } R \text{ at } (t_1(\bar{p}, \bar{x}), \dots, t_k(\bar{p}, \bar{x})) \text{ as } P \text{ where } C(\bar{x})$$

is defined as in Subsection 3.1, but A ⊨ ϕ_δ^R(ā, b̄) has to be replaced by the condition that P returns true under the assignment (p̄ → ā, x̄ → b̄).

For update rules for auxiliary functions, P returns the new function value instead of a Boolean value.

Since P1 is independent of x̄, in the semantics it is only evaluated once. In particular, any change invocations are triggered only once.

With Procedural-DynFO-programs we refer to the above class of dynamic update programs. Here and later we will introduce abbreviations as syntactic sugar, for example the sequential loop for z ≤ m do P, where m ∈ ℕ needs to be a fixed natural number.

We show next that update and query procedures can be translated into constant-time CRAM programs. Since the latter can be translated into FO formulas [14, Theorem 5.2], Procedural-DynFO-programs can be translated into DynFO-programs.

#### 3.3 Implementing Procedural-DynFO-programs as PRAMs

We use *Parallel Random Access Machines* (PRAMs) as the computational model to measure the work of our dynamic programs. A PRAM consists of a number of processors that work in parallel and use a shared memory. We only consider *CRAMs*, a special case of the Concurrent-Read Concurrent-Write model (CRCW PRAM), i.e. processors are allowed to read from and write to the same memory location concurrently, but if multiple processors concurrently write to the same memory location, then all of them need to write the same value. For an input of size n we denote the *time* that a PRAM algorithm needs to compute the solution by T(n). The *work* W(n) of a PRAM algorithm is the sum of the numbers of computation steps of all processors made during the computation. For further details we refer to [14,16].

It is easy to see that Procedural-DynFO programs P can be translated into O(1)-time CRAM-programs C. To be able to make a statement about (an upper bound on) the work of C, in the full version of this paper we associate a function w with update rules and show that every update rule π can be implemented by an O(1)-time CRAM-program with work O(w). Likewise for query rules.

In a nutshell, the work of an update procedure mainly depends on the scopes of the (nested) parallel branches and the amount of work needed to query and update the supplementary instances. The work of a whole update rule is then determined by adding the work of the initial procedure once and adding the work of the main procedure for each tuple that satisfies the constraint of the update rule.
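
As an illustration of this accounting (our schematic notation, not the formal definition of w from the full version), for an update rule π with initial procedure P1, main procedure P2 and constraint C, this reads

$$w_{\pi}(n) \;=\; w_{P_1}(n) \;+\; \bigl|\{\bar a : C(\bar a)\}\bigr| \cdot w_{P_2}(n),$$

where w_{P_1} and w_{P_2} bound the work of the two procedures, including the work of the operations they invoke on supplementary instances.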

### 4 A simple work-efficient Dynamic Program

In this section we consider a simple dynamic problem with a fairly work-efficient dynamic program. It serves as an example for our framework but will also be used as a subroutine in later sections.

The dynamic problem is to maintain a subset K of an ordered set D of elements under insertion and removal of elements in K, allowing for navigation from an element of D to the next larger and smaller element in K. That is, we consider the following dynamic problem:

```
Problem: NextInK
Input:   A set K ⊆ D with canonical linear order ≤ on D
Changes: ins(i): Inserts i ∈ D into K
         del(i): Deletes i ∈ D from K
Queries: pred(i): The largest element of K that is smaller than i
         succ(i): The smallest element of K that is larger than i
```

For the smallest (largest) element the result of a pred (succ) query is undefined, i.e. ⊥. For simplicity, we assume in the following that D is always of the form [n], for some n ∈ ℕ.

Sequentially, the changes and queries of NextInK can be handled in sequential time O(log log n) [9]. Here we show that the problem also has a dynamic program with parallel time O(1) and work O(log n).

Lemma 4.1. *There is a* DynFO*-program for* NextInK *with* O(log n) *work per change and query operation.*

*Proof.* The dynamic program uses an ordered, balanced binary tree T with leaf set [n], and with 0 as its leftmost leaf. Each inner node v represents the interval S(v) of numbers labelling the leaves of the subtree of v. To traverse the tree, the dynamic program uses functions 1st and 2nd that map an inner node to its first or second child, respectively, and a function anc(v, j) that returns the j-th ancestor of v in the tree.⁷ So, anc(v, 2) returns the parent of the parent of v.

The functions 1st, 2nd and anc are static, that is, they are initialized beforehand and not affected by change operations.

⁷ Formally, the 2|D| nodes of T can be represented by pairs (a, b) of elements. In our presentation, we disregard these technical issues and use nodes of T just as domain elements.

#### Algorithm 1 Querying a successor.




The idea of the dynamic program is to maintain, for each node v, the maximal and minimal element in K ∩ S(v) (which is undefined if K ∩ S(v) = ∅), by maintaining two functions min and max. It is easy to see that this information can be updated and queries answered in O(log n) time, as the tree has depth O(log n). To achieve O(log n) work and constant time, we need to have a closer look.

Using min and max, it is easy to determine the K-successor of an element i ∈ D: if v is the lowest ancestor of i with max(v) > i, then the K-successor of i is min(w) for the second child w := 2nd(v) of v. Algorithm 1 shows a query rule for the query operation succ(i). The update of these functions is easy when an element i is inserted into K; this is spelled out for min in Algorithm 2. The dynamic program only needs to check whether the new element becomes the minimal element in S(v), for every node v that is an ancestor of the leaf i.
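
For intuition, here is a small sequential Python sketch of this tree-based scheme (ours, not the paper's DynFO program, which performs the per-ancestor steps in parallel in O(1) time), with nodes encoded implicitly as in a segment tree:

```python
class NextInK:
    """Balanced binary tree over D = [n] (n a power of two); node v has children 2v and 2v+1."""

    def __init__(self, n):
        self.n = n
        self.min_ = [None] * (2 * n)   # minimum of K within the node's leaf interval (None = undefined)
        self.max_ = [None] * (2 * n)   # maximum of K within the node's leaf interval

    def succ(self, i):
        """Smallest element of K larger than i; None plays the role of ⊥."""
        v = (self.n + i) // 2                    # parent of the leaf of i
        while v >= 1:
            if self.max_[v] is not None and self.max_[v] > i:
                return self.min_[2 * v + 1]      # min of the second child of the lowest such ancestor
            v //= 2
        return None

    def insert(self, i):
        """After inserting i into K, update min/max for all ancestors of the leaf of i."""
        v = self.n + i
        while v >= 1:
            if self.min_[v] is None or i < self.min_[v]:
                self.min_[v] = i
            if self.max_[v] is None or i > self.max_[v]:
                self.max_[v] = i
            v //= 2
```

In the dynamic program, the loops over ancestors are replaced by a parallel branch over the O(log n) values of anc(v, j), so each operation costs O(1) parallel time and O(log n) work.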

Algorithm 3 shows how min can be updated if an element i is deleted from K: if i is the minimal element of K in S(v), for some node v, then min(v) needs to be replaced by the K-successor of i, provided that successor lies in S(v).
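
Continuing the sketch above (again sequential and our own illustration, not the paper's program), the min-update after a deletion could look as follows:

```python
# Continuation of the NextInK sketch above: the min-update after a deletion.
def delete(tree, i):
    """After deleting i from K, replace min(v) by the K-successor of i where needed."""
    s = tree.succ(i)                     # K-successor of i; not affected by i itself
    v, lo, hi, size = tree.n + i, i, i, 1
    while v >= 1:
        if tree.min_[v] == i:
            # the successor takes over only if it lies below v; otherwise K ∩ S(v) is now empty
            tree.min_[v] = s if (s is not None and lo <= s <= hi) else None
        # (the symmetric update of max_ would use the K-predecessor)
        if v % 2 == 0:                   # first child: the parent's interval extends to the right
            hi += size
        else:                            # second child: the parent's interval extends to the left
            lo -= size
        size *= 2
        v //= 2
```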

It is easy to verify the claimed work upper bounds for P. Querying a successor or predecessor via Algorithm 1 needs O(log n) work, since Line 6 requires O(log n) work and all other lines require O(1) work. For maintaining the function min, the programs in Algorithms 2 and 3 update the value of log n tuples, but the work per tuple is constant. In the case of a deletion, Line 3 requires O(log n) work but is executed only once. The remaining part consists of O(log n) parallel executions of statements, each with O(1) work.

The handling of max and its work analysis is analogous.

Algorithm 3 Updating min after a deletion.


#### 5 Regular Languages

In this section, we show that the range problem can be maintained with o(n) work for all regular languages and with polylogarithmic work for star-free languages. For the former we show how to reduce the work of a known DynFO-program. For the latter we translate the idea of [9] for maintaining the range problem for star-free languages in O(log log n) sequential time into a dynamic program with O(1) parallel time.

#### 5.1 DynFO-programs with sublinear work for regular languages

Theorem 5.1. *Let* L *be a regular language. Then* RangeMember(L) *can be maintained in DynFO with work* O(n^ε) *per query and change operation, for every* ε > 0*.*

The proof of this theorem makes use of the algebraic view of regular languages. For readers not familiar with this view, the basic idea is as follows: for a fixed DFA A = (Q, Σ, δ, q0, F), we first associate with each string w a function f_w on Q that is induced by the behaviour of A on w via f_w(q) := δ*(q, w), where δ* is the extension of the transition function δ to strings. The set of all functions f : Q → Q with composition as binary operation is a *monoid*, that is, a structure with an associative binary operation ∘ and a neutral element, the identity function. Thus, composing the effect of A on subsequent substrings of a string corresponds to multiplication of the monoid elements associated with these substrings. The *syntactic monoid* M(L) of a regular language L is basically the monoid associated with its minimal automaton.
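
To make this concrete, here is a tiny illustration in Python (our own toy DFA, not an object from the paper): f_w is represented as a tuple listing δ*(q, w) for each state q, and multiplication in the monoid is function composition.

```python
from functools import reduce

Q = (0, 1)
# Toy DFA over {a, b} accepting the strings with an even number of a's.
delta = {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 1}

def f(w):
    """The monoid element f_w, as the tuple (δ*(0, w), δ*(1, w))."""
    return tuple(reduce(lambda q, c: delta[(q, c)], w, q0) for q0 in Q)

def compose(g, h):
    """Monoid multiplication: apply g first, then h, so that f(u + v) == compose(f(u), f(v))."""
    return tuple(h[g[q]] for q in Q)

assert compose(f('ab'), f('ba')) == f('abba')   # composing the effects of substrings
```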

It is thus clear that, for the dynamic problem RangeMember(L) where L is regular, a dynamic program can be easily obtained from a dynamic program for the dynamic problem RangeEval(M(L)), where RangeEval(M), for finite monoids M, is defined as follows.⁸

⁸ We note that, unlike for words, each position always carries a monoid element. However, the empty string of the word case corresponds to the neutral element in the monoid case. In particular, the initial "empty" sequence consists of n copies of the neutral element.

```
Problem: RangeEval(M)
Input:   A sequence m0 ... m(n−1) of monoid elements mi ∈ M
Changes: set_m(i) for m ∈ M: Replaces mi by m
Queries: range(ℓ, r): mℓ ∘ ··· ∘ mr
```

For the proof of Theorem 5.1 we do not need any insights into monoid theory. However, when studying languages definable by first-order formulas in Theorem 5.3 below, we will make use of a known decomposition result.

From the discussion above it is now clear that in order to prove Theorem 5.1, it suffices to prove the following result.

Proposition 5.2. *Let* M *be a finite monoid. For every* ε > 0*,* RangeEval(M) *can be maintained in DynFO with work* O(n^ε) *per query and change operation.*

*Proof sketch.* In [11], it was (implicitly) shown that RangeMember(L) is in DynProp (that is, quantifier-free DynFO), for regular languages L. The idea was to maintain the effect of a DFA for L on w[ℓ, r], for each interval (ℓ, r) of positions. This approach can be easily used for RangeEval(M) as well, but it requires a quadratic number of updates after a change operation, in the worst case.

We adapt this approach and only store the effect of the DFA for O(n^ε) intervals, by considering a hierarchy of intervals of bounded depth.

The first level in the hierarchy of intervals is obtained by decomposing the input sequence into intervals of length t, for a carefully chosen t. We call these intervals *base intervals* of height 1 and their subintervals *special intervals* of height 1. The latter are *special* in the sense that they are exactly the intervals for which the dynamic program maintains the product of monoid elements. In particular, each base interval of height 1 gives rise to O(t^2) special intervals of height 1. The second level of the hierarchy is obtained by decomposing the sequence of base intervals of height 1 into sequences of length t. Each such sequence of length t is combined to one base interval of height 2; and each contiguous subsequence of such a sequence is combined to one special interval of height 2. Again, each base interval of height 2 gives rise to O(t^2) special intervals of height 2. This process is continued recursively for the higher levels of the hierarchy, until only one base interval of height h remains. We refer to Figure 1 for an illustration of this construction.

The splitting factor t is chosen in dependence on n and ε, such that the height of this hierarchy of special intervals only depends on ε and is thus constant for all n. More precisely, we fix λ := ε/2 and t := n^λ. Therefore, h = log_t(n) = 1/λ.
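
For concreteness (our numbers, only meant to illustrate the parameter choice): taking ε = 1/2 gives

$$\lambda = \tfrac14, \qquad t = n^{1/4}, \qquad h = \log_t n = 4, \qquad h \cdot t^2 = 4\,n^{1/2} \in O(n^{\varepsilon}),$$

so the hierarchy has only four levels and, as discussed below, a change operation touches only O(n^{1/2}) special intervals.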

The idea for the dynamic program is to store the product of monoid elements for each special interval. The two crucial observations are then that (1) the product of each (not necessarily special) interval can be computed with the help of a constant number of special intervals, and (2) each change operation affects at most t^2 special intervals per level of the hierarchy and thus at most ht^2 ∈ O(n^ε) special intervals in total. We refer to the full version for more details.


Fig. 1. Illustration of special intervals, for t = 3. The special intervals of level 3 are [0, 9), [9, 18), [18, 27), [0, 18) and [9, 27), with base interval [0, 27). The result of a query range(2, 22) can be computed as ∏_{i=2}^{22} m_i = m[2, 3) ∘ m[3, 9) ∘ m[9, 18) ∘ m[18, 21) ∘ m[21, 23), illustrated above in blue. The affected base intervals for a change at position 23 are marked in red. E.g., the new product m′[18, 27) can be computed as m′[18, 27) = m[18, 21) ∘ m′[21, 24) ∘ m[24, 27). As the products are recomputed bottom-up, m′[21, 24) is already updated.

#### 5.2 DynFO-programs with polylogarithmic work for star-free languages

Although the work bound of Theorem 5.1 for regular languages is strongly sublinear, one might aim for an even more work-efficient dynamic program, especially, since RangeMember(L) can be maintained *sequentially* with logarithmic update time for regular languages [9]. We leave it as an open problem whether for every regular language L there is a DynFO-program for RangeMember(L) with a polylogarithmic work bound. However, we show next that such programs exist for star-free regular languages, in fact they even have a logarithmic work bound. The star-free languages are those that can be expressed by regular expressions that do not use the Kleene star operator but can use complementation.

Theorem 5.3. *Let* L *be a star-free regular language. Then* RangeMember(L) *can be maintained in* DynFO *with work* O(log n) *per query and change operation.*

It is well-known that star-free regular languages are just the regular languages that can be defined in first-order logic (without arithmetic!) [18]. Readers might ask why we consider dynamic first-order maintainability of a problem that can actually be *expressed* in first-order logic. The key point is the parallel work here: even though the membership problem for star-free languages can be solved by a parallel algorithm in time O(1), it inherently requires parallel work Ω(n).

*Proof sketch.* The proof uses the well-known connection between star-free languages and group-free monoids (see, e.g., [25, Chapter V.3] and [25, Theorem V.3.2]). It thus follows the approach of [9].

In a nutshell, our dynamic program simply implements the algorithms of the proof of Theorem 2.4.2 in [9]. Those algorithms consist of a constantly bounded number of simple operations and a constantly bounded number of searches for a next neighbour in a set. Since the latter can be done in DynFO with work O(log n) thanks to Lemma 4.1, we get the desired result for group-free monoids and then for star-free languages. We refer to the full version for more details.

#### 6 Context-Free Languages

As we have seen in Section 5, range queries to regular languages can be maintained in DynFO with strongly sublinear work. An immediate question is whether context-free languages are equally well-behaved. Already the initial paper by Patnaik and Immerman showed that DynFO can maintain the membership problem for *Dyck languages* Dk, for k ≥ 1, that is, the languages of well-balanced parentheses expressions with k types of parentheses [20]. It was shown afterwards in [11, Theorem 4.1] that DynFO actually captures the membership problem for all context-free languages, and that Dyck languages do not even require quantifiers in formulas (but functions in the auxiliary structure) [11, Proposition 4.4]. These results can easily be seen to apply to range queries as well. However, the dynamic program of [11, Theorem 4.1] uses 4-ary relations and three nested existential quantifiers, yielding work in the order of n^7.

In the following, we show that the membership problem for context-free languages is likely *not* solvable in DynFO with sublinear work, but that the Dyck language D1 with one bracket type can be handled with polylogarithmic work for the membership problem and work O(n^ε) for the range problem, and that for other Dyck languages these bounds hold with an additional linear factor n.

#### 6.1 A conditional lower bound for context-free languages

Our conditional lower bound for context-free languages is based on a result from Abboud et al. [2] and the simple observation that the word problem for a language L can be solved, given a dynamic program for its membership problem.

Lemma 6.1. *Let* L *be a language. If* Member(L) *can be maintained in* DynFO *with work* f(n)*, then the word problem for* L *can be decided sequentially in time* O(n · f(n))*.*
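
The idea behind the reduction is simply to replay the word into the dynamic program; a sketch (the dyn.change/dyn.query interface is our own stand-in for a sequentially simulated dynamic program, not notation from the paper):

```python
def word_problem(w, dyn):
    """Decide w ∈ L, given a simulated dynamic program dyn for Member(L) with work f(n)."""
    for i, a in enumerate(w):
        dyn.change('set', i, a)     # n change operations, each simulated in O(f(n)) sequential time
    return dyn.query('member')      # plus one membership query, for O(n · f(n)) in total
```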

The announced lower bound is relative to the following conjecture [1].

*Conjecture 6.2 (*k*-Clique conjecture).* For any ε > 0 and k ≥ 3, k-Clique has no algorithm with time bound O(n^{(1−ε)ωk/3}).

Here, ω is the matrix multiplication exponent [10,27], which is known to be smaller than 2.373 and believed to be exactly two [10,27].

In [2], the word problem for context-free languages was linked to the k-Clique problem as follows.

Theorem 6.3 ([2, Theorem 1.1]). *There is a context-free grammar* G *such that, if the word problem for* L(G) *can be solved in time* T(n)*, then* k*-Clique can be solved on* n*-node graphs in* O(T(n^{k/3+1})) *time, for any* k ≥ 3*.*

Putting Lemma 6.1 and Theorem 6.3 together, we get the following result.

Theorem 6.4. *There is a context-free grammar* G *such that, if the membership problem for* L(G) *can be solved by a* DynFO*-program with work* O(n^{ω−1−ε})*, for some* ε > 0*, then the* k*-Clique conjecture fails.*

The simple proofs of Lemma 6.1 and Theorem 6.4 are presented in the full version.

Thus, we cannot reasonably expect any DynFO-programs for general context-free languages with considerably less work than O(n^{1.37}), barring any breakthroughs for matrix multiplication. In fact, for "combinatorial DynFO-programs", an analogous reasoning yields a work lower bound of O(n^{2−ε}).

#### 6.2 On work-efficient dynamic programs for Dyck languages

We next turn to Dyck languages. Clearly, all Dyck languages are deterministic context-free, their word problem can therefore be solved in linear time, and thus the lower bound approach of the previous subsection does not work for them. We first present the DynFO-program with polylogarithmic work for the membership problem of D1. It basically mimics the sequential algorithm from [8] that maintains D1 sequentially in time O(log n) per change and query operation.

Theorem 6.5. Member(D1) *can be maintained in* DynFO *with* O((log n)^3) *work.*

*Proof sketch.* Let Σ1 = {(, )} be the alphabet underlying D1. The dynamic program uses an ordered binary tree T such that each leaf corresponds to one position, from left to right. A parent node corresponds to the set of positions of its children. We assume for simplicity that the domain is [n], for some number n that is a power of 2. In a nutshell, the program maintains for each node x of T the numbers ℓ(x) and r(x) that represent the number of unmatched closing and unmatched opening brackets of the string str(x) corresponding to x via the leaves of the induced subtree at x. E.g., if that string is ))( for x, then ℓ(x) = 2 and r(x) = 1. The overall string w is in D1 exactly if r(root) = ℓ(root) = 0.
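
The values ℓ and r of an inner node are determined by its children: unmatched opening brackets of the first child cancel against unmatched closing brackets of the second child. A one-line rendering of this standard combination rule (our illustration; the dynamic program maintains ℓ and r via the effect propagation described next, rather than by recomputation):

```python
def combine(l1, r1, l2, r2):
    """ℓ and r of a node x from its children y1, y2, where str(x) = str(y1) · str(y2)."""
    matched = min(r1, l2)                      # openings of y1 matched by closings of y2
    return l1 + (l2 - matched), r2 + (r1 - matched)

assert combine(2, 1, 0, 0) == (2, 1)           # '))(' followed by the empty string
assert combine(1, 1, 1, 0) == (1, 0)           # ')(' followed by ')' reduces to ')'
```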

In the algorithm of [8], the functions ℓ and r are updated in a bottom-up fashion. However, we will observe that they do not need to be updated sequentially in that fashion, but can be updated in parallel constant time. In the following, we describe how P can update ℓ(x) and r(x) for all ancestor nodes x of a position p, after a closing parenthesis was inserted at p. Maintaining ℓ and r for the other change operations is analogous.

There are two types of effects that an insertion of a closing parenthesis could have on x: either ℓ(x) is increased by one and r(x) remains unchanged, or r(x) is decreased by one and ℓ(x) remains unchanged. We denote these effects by the pairs (+1, 0) and (0, −1), respectively.

Table 1 shows how the effect of a change at a position p below a node x with children y1 and y2 relates to the effect at the affected child. This depends on whether r(y1) ≤ ℓ(y2) and on whether the affected child is y1 or y2. A closer inspection of Table 1 reveals a crucial observation: in the upper left and the lower right field of the table, the effect on x is *independent* of the effect on the child (be it y1 or y2). That is, these cases induce an effect on x independent of the children. We thus call these cases *effect-inducing*. In the other two fields, the


Table 1. The effect on x after a closing parenthesis was inserted at position p. The effects depend on the effect on the children y1 and y2 of x: for example, an entry '(0, −1) → (+1, 0)' in the column 'p is in str(y1)' means that if the change operation has effect (0, −1) on y1, then the change operation has effect (+1, 0) on x.

effect on x depends on the effect at the child, but in the simplest possible way: they are just the same. That is, the effect at the child is just adopted by x. We call these cases *effect-preserving*. To determine the effect at x it is thus sufficient to identify the highest affected descendant node z of x where an effect-inducing case applies, such that for all intermediate nodes between x and z only effect-preserving cases apply.

Our dynamic program implements this idea. First it determines, for each ancestor x of the change position p, whether it is effect-inducing and which effect is induced. Then it identifies, for each x, the node z (represented by its height i above p) as the unique effect-inducing node that has no effect-inducing node on its path to x. The node z can be identified with work O((log n)^2), as z is one of at most log n many nodes on the path from x to the leaf of p, and one needs to check that all nodes between x and z are effect-preserving. As the auxiliary relations need to be updated for log n many nodes, the overall work of P is O((log n)^3). We refer to the full version for more details.

**A work-efficient dynamic program for range queries for D1 and Dk.** Unfortunately, the program of Theorem 6.5 does not support range queries, since it seems that one would need to combine the unmatched parentheses of log n many nodes of the binary tree in the worst case. However, its idea can be combined with the idea of Proposition 5.2, yielding a program that maintains ℓ and r for O(n^ε) special intervals on a constant number of levels.

In fact, this approach even works for Dk for k > 1. Indeed, with the help of ℓ and r, it is possible to identify for each position of an opening parenthesis the position of the corresponding closing parenthesis in O(1) parallel time with work n, and then one only needs to check that they match everywhere. The latter contributes an extra factor O(n) to the work, for k > 1, but can be skipped for k = 1.

Theorem 6.6. *For all* ε > 0 *and* k > 1*,*


*Proof sketch.* In the following we reuse the definition of *special intervals* from the proof of Proposition 5.2 as well as the definitions of ℓ and r from the proof of Theorem 6.5. We first describe a dynamic program for RangeMember(D1). It maintains ℓ and r for all special intervals, which is clearly doable with O(n^ε) work per change operation. Similar to the proof of Proposition 5.2, the two crucial observations (justified in the full version) are that (1) a range query can be answered with the help of a constant number of special intervals, and (2) a change operation affects only a bounded number of special intervals per level.

As stated before, the program for RangeMember(Dk) also maintains ℓ and r, but it should be emphasised that, also in the case of several parenthesis types, the definition of these functions ignores the bracket type. With that information it computes, for each opening bracket, the position of its matching closing bracket, with the help of ℓ and r, and checks that they match. This can be done in parallel and with work O(n^ε) per position. We refer to the full version for more details.

**Moderately work-efficient dynamic programs for Dk.** We now turn to the membership query for Dk with k > 1. Again, our program basically mimics the sequential algorithm from [8], which heavily depends on the dynamic problem StringEquality that asks whether two given strings are equal.

```
Problem: StringEquality
Input:   Two sequences u = u0 ... u(n−1) and v = v0 ... v(n−1) of letters with ui, vi ∈ Σ ∪ {ε}
Changes: set_{x,σ}(i) for σ ∈ Σ, x ∈ {u, v}: Sets xi to σ, if xi = ε
         reset_x(i) for x ∈ {u, v}: Sets xi to ε
Queries: equals: Is u0 ∘ ... ∘ u(n−1) = v0 ∘ ... ∘ v(n−1)?
```

It is easy to show that a linear amount of work is sufficient to maintain StringEquality.

Lemma 6.7. StringEquality *is in* DynFO *with work* O(n)*.*

Because of the linear work bound for StringEquality our dynamic program for Member(Dk) also has a linear factor in the work bound.

Theorem 6.8. Member(Dk) *is maintainable in* DynFO *with work* O(n log n + (log n)^3) *for every fixed* k ∈ ℕ*.*

*Proof sketch.* The program can be seen as an extension of the program for Member(D1). As unmatched parentheses are no longer well-defined if we have more than one type of parenthesis, the idea of [8] is to maintain the parentheses to the left and right that remain if we reduce the string by matching opening and closing parentheses regardless of their type. To be able to answer Member(Dk), the dynamic program maintains the unmatched parentheses for every node x of a tree spanning the input word, and a bit M(x) that indicates whether the types of the parentheses match properly.

How the unmatched parentheses can be maintained for a node x after a change operation depends on the "segment" of str(x) in which the change happened and in some cases reduces to finding a node z with a local property on the path from x to the leaf that corresponds to the changed position.

To update M(x) for a node x with children y1 and y2, the dynamic program compares the unmatched parentheses to the right of y1 with the ones to the left of y2 using StringEquality. We refer to the full version for more details.

Maintaining string equality and membership in Dk for k > 1 are even more closely related, as stated in the following lemma.


#### 7 Conclusion

In this paper we proposed a framework for studying the aspect of work for the dynamic, parallel complexity class DynFO. We established that all regular languages can be maintained in DynFO with O(n^ε) work for all ε > 0, and even with O(log n) work for star-free regular languages. For context-free languages we argued that it will be hard to achieve work bounds lower than O(n^{ω−1−ε}) in general, where ω is the matrix multiplication exponent. For the special case of Dyck languages Dk we showed that O(n · (log n)^3) work suffices, which can be further reduced to O((log n)^3) work for D1. For range queries, dynamic programs with work O(n^{1+ε}) and O(n^ε) exist, respectively.

We highlight some research directions. One direction is to improve the upper bounds on work obtained here. For instance, it would be interesting to know whether all regular languages can be maintained with polylog or even O(log n) work and how close the lower bounds for context-free languages can be matched. Finding important subclasses of context-free languages for which polylogarithmic work suffices is another interesting question. Apart from string problems, many DynFO results concern problems on dynamic graphs, especially the reachability query [5]. How large is the work of the proposed dynamic programs, and are more work-efficient dynamic programs possible?

The latter question also leads to another research direction: to establish further lower bounds. The lower bounds obtained here are relative to strong conjectures. Absolute lower bounds are an interesting goal which seems in closer reach than lower bounds for DynFO without bounds on the work.

#### References

1. Abboud, A., Backurs, A., Bringmann, K., Künnemann, M.: Fine-grained complexity of analyzing compressed data: Quantifying improvements over decompress-andsolve. In: Umans, C. (ed.) 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017. pp. 192–203. IEEE Computer Society (2017). https://doi.org/10.1109/FOCS.2017.26


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/ 4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

#### **Learning Pomset Automata**⋆

Gerco van Heerdt¹ (✉), Tobias Kappé², Jurriaan Rot³, and Alexandra Silva¹

¹ University College London, London, UK, gerco.heerdt@ucl.ac.uk
² Cornell University, Ithaca, NY, USA
³ Radboud University, Nijmegen, The Netherlands

**Abstract.** We extend the L⋆ algorithm to learn bimonoids recognising pomset languages. We then identify a class of pomset automata that accepts precisely the class of pomset languages recognised by bimonoids and show how to convert between bimonoids and automata.

### **1 Introduction**

Automata learning algorithms are useful in automated inference of models, which is needed for verification of hardware and software systems. In *active* learning, the algorithm interacts with a system through tests and observations to produce a model of the system's behaviour. One of the first active learning algorithms proposed was L⋆, due to Dana Angluin [2], which infers a minimal deterministic automaton for a target regular language. L⋆ has been used in a range of verification tasks, including learning error traces in a program [5]. For more advanced verification tasks, richer automata types are needed, and L⋆ has been extended to e.g. input-output [1], register [20], and weighted automata [16]. None of the existing extensions can be used in the analysis of concurrent programs.

Partially ordered multisets (pomsets) [13,12] are basic structures used in the modeling and semantics of concurrent programs. Pomsets generalise words, allowing to capture both the sequential and the parallel structure of a trace in a concurrent program. Automata accepting pomset languages are therefore useful to study the operational semantics of concurrent programs—see, for instance, work on concurrent Kleene algebra [17,26,21,24].

In this paper, we propose an active learning algorithm for a class of pomset automata. The approach is algebraic: we consider languages of pomsets recognised by bimonoids [28] (which we shall refer to as pomset recognisers). This can be thought of as a generalisation of the classical approach to language theory of using monoids as word acceptors: bimonoids have an extra operation that models parallel composition in addition to sequential. The two operations give rise to a complex branching structure that makes the learning process non-trivial.

⋆ This work was partially supported by the ERC Starting Grant ProFoundNet (679127) and the EPSRC Standard Grant CLeVer (EP/S028641/1). The authors thank Matteo Sammartino for useful discussions.

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 510–530, 2021. https://doi.org/10.1007/978-3-030-71995-1 26

The key observation is that pomset recognisers are tree automata whose algebraic structure satisfies additional equations. We extend tree automata learning algorithms [7,8,31] to pomset recognisers. The main challenge is to ensure that intermediate hypotheses in the algorithm are valid pomset recognisers, which is essential in practical scenarios where the learning process might not run to the very end, returning an approximation of the system under learning. This requires equations of bimonoids to be correctly propagated and preserved in the core data structure of the algorithm—the observation table. The proof of termination, in analogy to L⋆, relies on the existence of a canonical pomset recogniser of a language, which is based on its syntactic bimonoid. The steps of the algorithm provide hypotheses that get closer in size to the canonical recogniser.

Finally, we bridge the learning algorithm to pomset automata [21,22] by providing two constructions that enable us to seamlessly move between pomset recognisers and pomset automata. Note that although bimonoids provide a useful formalism to denote pomset languages, which is amenable to the design of the learning algorithm, they enforce a redundancy that is not present in pomset automata: whereas a pomset automaton processes a pomset from left to right in sequence, one letter per branch at a time, a bimonoid needs to be able to take the pomset represented as a binary tree in any way and process it bottom-up. This requirement of different decompositions leading to the same result makes bimonoids in general much larger than pomset automata and hence the latter are, in general, a more efficient representation of a pomset language.

The rest of the paper is organised as follows. We conclude this introductory section with a review of relevant related work. Section 2 contains the basic definitions on pomsets and pomset recognisers. The learning algorithm for pomset recognisers appears in Section 3, including proofs to ensure termination and invariant preservation. Section 4 presents constructions to translate between (a class of) pomset automata and pomset recognisers. We conclude with discussion of further work in Section 5. Omitted proofs appear in the extended version [15].

*Related Work.* There is a rich literature on adaptations and extensions of L⋆ from deterministic automata to various kinds of models; see, e.g., [34,18] for an overview. To the best of our knowledge, this paper is the first to provide an active learning algorithm for pomset languages recognised by finite bimonoids.

Our algorithm learns an algebraic recogniser. Urbat and Schröder [33] provide a very general learning approach for languages recognised by algebras for monads [4,32], based on a reduction to categorical automata, for which they present an L⋆-type algorithm. Their reduction gives rise to an infinite alphabet in general, so tailored work is needed for deriving algorithms and finite representations. This can be done for instance for monoids, recognising regular languages, but it is not clear how this could extend to pomset recognisers. We present a direct learning algorithm for bimonoids, which does not rely on any encoding.

Our concrete learning algorithm for bimonoids is closely related to learning approaches for bottom-up tree automata [7,8,31]: pomset languages can be viewed as tree languages satisfying certain equations. Incorporating these equations turned out to be a non-trivial task, which requires additional checks on the observation table during execution of the algorithm.

Conversion between recognisers and automata for a pomset language was first explored by Lodaya and Weil [28,27]. Their results relate the expressive power of these formalisms to *sr-expressions*. As a result, converting between recognisers and automata using their construction uses an sr-expression as an intermediate representation, increasing the resulting state space. Our construction, however, converts recognisers directly to pomset automata, which keeps the state space relatively small. Moreover, Lodaya and Weil focus on pomset languages of *bounded width*, i.e., with an upper bound on the number of parallel events. In contrast, our conversions work for all recognisable pomset languages (and a suitable class of pomset automata), including those of unbounded width.

Ésik and Németh [9] considered automata and recognisers for *biposets*, i.e., sp-pomsets without commutativity of parallel composition. They equate languages recognised by *bisemigroups* (bimonoids without commutativity or units) with those accepted by *parenthesizing automata*. Our equivalence is similar in structure, but relates a subclass of pomset automata to bimonoids instead. The results in this paper can easily be adapted to learn representations of biposet languages using bisemigroups, and convert those to parenthesizing automata.

### **2 Pomset Recognisers**

Throughout this paper we fix a finite *alphabet* Σ and assume □ ∉ Σ. When defining sets parameterised by a set X, say S(X), we may use S to refer to S(Σ).

We recall pomsets [12,13], a generalisation of words that model concurrent traces. A *labelled poset* over X is a tuple **u** = ⟨S_**u**, ≤_**u**, λ_**u**⟩, where S_**u** is a finite set (the *carrier* of **u**), ≤_**u** is a partial order on S_**u** (the *order* of **u**), and λ_**u** : S_**u** → X is a function (the *labelling* of **u**). Pomsets are labelled posets up to isomorphism.

**Definition 1 (Pomsets).** *Let* **u**, **v** *be labelled posets over* X*. An* embedding *of* **u** *in* **v** *is an injection* h : S_**u** → S_**v** *such that* λ_**v** ∘ h = λ_**u** *and* s ≤_**u** s′ *if and only if* h(s) ≤_**v** h(s′)*. An* isomorphism *is a bijective embedding whose inverse is also an embedding. We say* **u** *is* isomorphic *to* **v***, denoted* **u** ≅ **v***, if there exists an isomorphism between* **u** *and* **v***. A* pomset *over* X *is an isomorphism class of labelled posets over* X*, i.e.,* [**v**] = {**u** : **u** ≅ **v**}*. When* u = [**u**] *and* v = [**v**] *are pomsets,* u *is a* subpomset *of* v *when there exists an embedding of* **u** *in* **v***.*

When two pomsets are in scope, we tacitly assume that they are represented by labelled posets with disjoint carriers. We write 1 for the empty pomset. When a ∈ X, we write a for the pomset represented by the labelled poset whose sole element is labelled by a. Pomsets can be composed in sequence and in parallel:

**Definition 2 (Pomset composition).** *Let* u = [**u**] *and* v = [**v**] *be pomsets over* X*. We write* u ∥ v *for the* parallel composition *of* u *and* v*, which is the pomset over* X *represented by the labelled poset*

$$\mathbf{u} \parallel \mathbf{v} = \langle S_{\mathbf{u}} \cup S_{\mathbf{v}},\ \leq_{\mathbf{u}} \cup \leq_{\mathbf{v}},\ \lambda_{\mathbf{u}} \cup \lambda_{\mathbf{v}}\rangle$$

*Similarly, we write* u · v *for the* sequential composition *of* u *and* v*, that is, the pomset represented by the labelled poset*

$$\mathbf{u} \cdot \mathbf{v} = \langle S_{\mathbf{u}} \cup S_{\mathbf{v}},\ \leq_{\mathbf{u}} \cup \leq_{\mathbf{v}} \cup S_{\mathbf{u}} \times S_{\mathbf{v}},\ \ \lambda_{\mathbf{u}} \cup \lambda_{\mathbf{v}} \rangle$$

*We may elide the dot for sequential composition, for instance writing* ab *for* a·b*.*

The pomsets we use can be built using sequential and parallel composition.

**Definition 3 (Series-parallel pomsets).** *The set of* series-parallel pomsets (sp-pomsets) *over* X*, denoted* SP(X)*, is the smallest set such that* 1 ∈ SP(X) *and* a ∈ SP(X) *for every* a ∈ X*, closed under parallel and sequential composition.*
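
A minimal executable rendering (ours) of these two compositions on concrete labelled posets, representing a poset as a list of labels indexed by carrier position together with a set of strict order pairs; quotienting by isomorphism is omitted:

```python
def letter(a):
    """The labelled poset with a single element labelled a."""
    return ([a], set())

def par(u, v):
    """u ∥ v: disjoint union of carriers, orders and labellings (v's elements renamed apart)."""
    lu, ou = u
    lv, ov = v
    k = len(lu)
    return (lu + lv, ou | {(i + k, j + k) for (i, j) in ov})

def seq(u, v):
    """u · v: like u ∥ v, but additionally every element of u lies below every element of v."""
    labels, order = par(u, v)
    k = len(u[0])
    return (labels, order | {(i, j) for i in range(k) for j in range(k, len(labels))})

a_par_b = par(letter('a'), letter('b'))   # a ∥ b: two incomparable elements
twice = seq(a_par_b, a_par_b)             # (a ∥ b) · (a ∥ b)
```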

Concurrent systems admit executions of operations that are not only ordered in sequence but also allow parallel branches. An algebraic structure consisting of both a sequential and a parallel composition operation, with a shared unit, is called a *bimonoid*. Formally, its definition is as follows.

**Definition 4 (Bimonoid).** *A* bimonoid *is a tuple* ⟨M, ⊙, ∥, **1**⟩ *where*

- ⟨M, ⊙, **1**⟩ *is a monoid, i.e.,* ⊙ *is associative with unit* **1***;*
- ⟨M, ∥, **1**⟩ *is a commutative monoid, i.e.,* ∥ *is associative and commutative with the same unit* **1***.*
*Bimonoid homomorphisms are defined in the usual way.*

Given a set X, the *free bimonoid* [12] over X is ⟨SP(X), ·, ∥, 1⟩. The fact that it is free means that for every function f : X → M into a bimonoid ⟨M, ⊙, ∥, **1**_M⟩ there exists a unique bimonoid homomorphism f♯ : SP(X) → M such that the restriction of f♯ to X is f.

Just as monoids can recognise words, bimonoids can recognise pomsets [28]. A bimonoid together with the witnesses of recognition is a *pomset recogniser*.

**Definition 5 (Pomset recogniser).** *A* pomset recogniser *is a tuple* R = ⟨M, ⊙, ∥, **1**, i, F⟩ *where* ⟨M, ⊙, ∥, **1**⟩ *is a bimonoid,* i : Σ → M*, and* F ⊆ M*. The* language recognised *by* R *is given by* L_R = {u ∈ SP : i♯(u) ∈ F} ⊆ SP*.*

*Example 6.* Suppose a program consists of a loop, where each iteration runs actions a and b in parallel. We can describe the behaviour of this program by

$$\mathcal{L} = \{ \mathbf{a} \parallel \mathbf{b} \}^* = \{ 1,\ \mathbf{a} \parallel \mathbf{b},\ (\mathbf{a} \parallel \mathbf{b}) \cdot (\mathbf{a} \parallel \mathbf{b}),\ \dots \}$$

We can describe this language using a pomset recogniser, as follows. Let M = {qa, qb, q1, q⊥, **1**}, and let ⊙ and ∥ be the operations on M given by

$$q \odot q' = \begin{cases} q & q'=\mathbf{1} \\ q' & q=\mathbf{1} \\ q_1 & q=q'=q_1 \\ q_\perp & \text{otherwise} \end{cases} \qquad q \parallel q' = \begin{cases} q & q'=\mathbf{1} \\ q' & q=\mathbf{1} \\ q_1 & \{q,q'\} = \{q_\mathbf{a},q_\mathbf{b}\} \\ q_\perp & \text{otherwise} \end{cases}$$

A straightforward proof verifies that ⟨M, ⊙, ∥, **1**⟩ is a bimonoid. We set i(a) = qa, i(b) = qb, and F = {**1**, q1}. Now, for n > 0:

$$i^{\sharp}(\underbrace{(\mathbf{a} \parallel \mathbf{b}) \cdots (\mathbf{a} \parallel \mathbf{b})}_{n \text{ times}}) = \underbrace{(i(\mathbf{a}) \parallel i(\mathbf{b})) \odot \cdots \odot (i(\mathbf{a}) \parallel i(\mathbf{b}))}_{n \text{ times}} = \underbrace{q_1 \odot \cdots \odot q_1}_{n \text{ times}} = q_1$$

No other pomsets are mapped to q1; hence, ⟨M, ⊙, ∥, **1**, i, F⟩ accepts L.
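
For illustration, a small executable version (ours) of this recogniser, with sp-pomsets given as nested terms ('seq'/'par', t1, t2) so that i♯ can be evaluated bottom-up:

```python
ONE = '1'

def seq_op(q, r):                      # the operation ⊙ of Example 6
    if r == ONE: return q
    if q == ONE: return r
    return 'q1' if q == r == 'q1' else 'qbot'

def par_op(q, r):                      # the operation ∥ of Example 6
    if r == ONE: return q
    if q == ONE: return r
    return 'q1' if {q, r} == {'qa', 'qb'} else 'qbot'

i = {'a': 'qa', 'b': 'qb'}
F = {ONE, 'q1'}

def evaluate(t):                       # i♯ on a term representation of an sp-pomset
    if t == (): return ONE             # the empty pomset 1
    if isinstance(t, str): return i[t] # a single letter
    op, l, r = t
    return (seq_op if op == 'seq' else par_op)(evaluate(l), evaluate(r))

ab = ('par', 'a', 'b')
assert evaluate(('seq', ab, ab)) in F       # (a ∥ b) · (a ∥ b) ∈ L
assert evaluate(('seq', 'a', ab)) not in F  # a · (a ∥ b) ∉ L
```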

*Example 7.* Suppose a program solves a problem recursively, such that the recursive calls are performed in parallel. In that case, the program would either perform the base action b, or some preprocessing action a followed by running two copies of itself in parallel. This behaviour can be described by the smallest pomset language L satisfying the following inference rules:

$$\frac{}{\mathsf{b} \in \mathcal{L}} \qquad\qquad \frac{u,v \in \mathcal{L}}{\mathsf{a} \cdot (u \parallel v) \in \mathcal{L}}$$

This language can be described by a pomset recogniser. Let our carrier set be M = {qa, qb, q1, q⊥, **1**}, and let ⊙ and ∥ be the operations on M given by

$$q \odot q' = \begin{cases} q & q'=\mathbf{1} \\ q' & q=\mathbf{1} \\ q_\mathbf{b} & q=q_\mathbf{a},\ q'=q_1 \\ q_\perp & \text{otherwise} \end{cases} \qquad q \parallel q' = \begin{cases} q & q'=\mathbf{1} \\ q' & q=\mathbf{1} \\ q_1 & q=q'=q_\mathbf{b} \\ q_\perp & \text{otherwise} \end{cases}$$

⟨M, ⊙, ∥, **1**⟩ is a bimonoid, F = {qb}, and i : Σ → M is given by setting i(a) = qa and i(b) = qb. One can then show that ⟨M, ⊙, ∥, **1**, i, F⟩ accepts L.

*Pomset contexts* are used to describe the behaviour of individual elements in a pomset recogniser. Formally, the set of pomset contexts over a set X is given by PC(X) = SP(X ∪ {□}). Here the element □ acts as a placeholder where a pomset can be plugged in: given a context c ∈ PC(X) and t ∈ SP(X), let c[t] ∈ SP(X) be obtained by substituting t for □ in c.

#### **3 Learning Pomset Recognisers**

In this section we present our algorithm to learn pomset recognisers from an oracle (*the teacher* ) that answers *membership* and *equivalence queries*. A membership query consists of a pomset, to which the teacher replies whether that pomset is in the language; an equivalence query consists of a *hypothesis* pomset recogniser, to which the teacher replies *yes* if it is correct or *no* with a counterexample—a pomset incorrectly classified by the hypothesis—if it is not.

A pomset recogniser is essentially a tree automaton, with the additional constraint that its algebraic structure satisfies the bimonoid axioms. Our algorithm is therefore relatively close to tree automata learning—in particular Drewes and Högberg [7,8]—but there are several key differences: we optimise the algorithm by taking advantage of the bimonoid axioms, and at the same time need to ensure that the hypotheses generated by the learning process satisfy those axioms.

#### **3.1 Observation Table**

We fix a target language L ⊆ SP throughout this section. As in the original L⋆ algorithm, the state of the learner throughout a run of the algorithm is given by a data structure called the *observation table*, which collects information about L. The table contains rows indexed by pomsets, representing the state reached by the correct pomset recogniser after reading that pomset; and columns indexed by pomset contexts, used to approximately identify the behaviour of each state. To represent the additional rows needed to approximate the pomset recogniser structure, we use the following definition. Given U ⊆ SP, we define

$$U^{+} = \Sigma \cup \{u \cdot v : u, v \in U\} \cup \{u \parallel v : u, v \in U\} \subseteq \mathbb{S}\mathbb{P}.$$

**Definition 8 (Observation table).** *An* observation table *is a pair* ⟨S, E⟩*, with* S ⊆ SP *subpomset-closed and* E ⊆ PC *such that* 1 ∈ S *and* □ ∈ E*. These sets induce the function* row_{S,E} : S ∪ S⁺ → 2^E*:* row_{S,E}(s)(e) = 1 ⟺ e[s] ∈ L*. We often write* row *instead of* row_{S,E} *when* S *and* E *are clear from the context.*

We depict observation tables, or more precisely row, as two separate tables, with rows in S and S⁺ \ S respectively; see for instance Example 9 below.
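
Using the term representation of pomsets from the sketch after Example 6, the row map is easy to phrase directly against a membership oracle (our code; member() is a stand-in for a membership query):

```python
HOLE = '[]'                                      # stands for the placeholder □

def plug(e, s):
    """e[s]: substitute the pomset term s for □ in the context term e."""
    if e == HOLE:
        return s
    if isinstance(e, str) or e == ():
        return e
    op, l, r = e
    return (op, plug(l, s), plug(r, s))

def row(s, E, member):
    """row(s) ∈ 2^E, as a tuple of bits; one membership query per context e ∈ E."""
    return tuple(int(member(plug(e, s))) for e in E)
```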

The goal of the learner is to extract a *hypothesis* pomset recogniser from the rows in the table. More specifically, the carrier of the underlying bimonoid of the hypothesis will be given by the rows indexed by pomsets in S. The structure on the rows is obtained by transferring the structure of the row labels onto the rows (e.g., row(s) ⊙ row(t) = row(s · t)), but this is not well-defined unless the table satisfies *closedness*, *consistency*, and *associativity*. Closedness and consistency are standard in L⋆, whereas associativity is a new property specific to bimonoid learning. We discuss each of these properties next, also including *compatibility*, a property that is used to show minimality of hypotheses.

The first potential issue is a closedness defect: this is the case when a composed row, indexed by an element of S<sup>+</sup>, is not indexed by a pomset in S.

*Example 9 (Table not closed).* Recall L = {a ∥ b}* from Example 6, and suppose S = {1, a, b} and E = {□, a ∥ □, □ ∥ b}. The induced table is


The carrier of the hypothesis bimonoid is M = {row(1), row(a), row(b)}, but the composition row(a) ⊙ row(a) cannot be defined, since row(aa) ∉ M.

The absence of the issue described above is captured with *closedness*.

**Definition 10 (Closed table).** *An observation table* ⟨S, E⟩ *is* closed *if for all* t ∈ S⁺ *there exists* s ∈ S *such that* row(s) = row(t)*.*

Another issue that may occur is that the same row being represented by different index pomsets leads to an inconsistent definition of the structure. The absence of this issue is referred to as *consistency*.

**Definition 11 (Consistent table).** *An observation table* S, E *is* consistent *if for all* s1, s<sup>2</sup> ∈ S *such that* row(s1) = row(s2) *we have for all* t ∈ S *that*

row(s1 · t) = row(s2 · t), row(t · s1) = row(t · s2), and row(s1 ∥ t) = row(s2 ∥ t).

Whenever closedness and consistency hold, one can define sequential and parallel composition operations on the rows of the table. However, these operations are not guaranteed to be associative, as we show with the following example.

*Example 12 (Table not associative).* Consider L = {au : u ∈ {b}*} over Σ = {a, b}, and suppose S = {1, a, b} and E = {□, □ · a}. The induced table is:


This table does not lead to an associative sequential operation on rows:

(row(a) ⊙ row(b)) ⊙ row(a) = row(ab) ⊙ row(a) = row(a) ⊙ row(a) = row(aa) ≠ row(ab) = row(a) ⊙ row(b) = row(a) ⊙ row(ba) = row(a) ⊙ (row(b) ⊙ row(a)).

To prevent this issue we enforce the following additional property:

**Definition 13 (Associative table).** *Let* ♥ ∈ {·, ∥}*. An observation table* ⟨S, E⟩ *is* ♥-associative *if for all* s1, s2, s3, sl, sr ∈ S *with* row(sl) = row(s1 ♥ s2) *and* row(sr) = row(s2 ♥ s3) *we have* row(sl ♥ s3) = row(s1 ♥ sr)*. An observation table is* associative *if it is both* ·*-associative and* ∥*-associative.*

The table from Example 12 is *not* ·-associative: we have row(a) = row(ab) and row(b) = row(ba), but row(aa) ≠ row(ab).
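
A naive rendering of Definition 13 as an executable check (our code, enumerating all tuples) finds such defects; on Example 12 with ♥ = · it returns a witness such as (a, b, a, a, b):

```python
from itertools import product

def associativity_defect(S, row, comb):
    """Return (s1, s2, s3, sl, sr) violating ♥-associativity for the composition comb, or None."""
    for s1, s2, s3, sl, sr in product(S, repeat=5):
        if row(sl) == row(comb(s1, s2)) and row(sr) == row(comb(s2, s3)) \
                and row(comb(sl, s3)) != row(comb(s1, sr)):
            return (s1, s2, s3, sl, sr)
    return None
```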

Putting the above definitions of closedness, consistency and associativity of tables together, we have the following result for constructing a hypothesis.

**Lemma 14 (Hypothesis).** *A closed, consistent and associative table* ⟨S, E⟩ *induces a* hypothesis *pomset recogniser* H = ⟨H, ⊙_H, ∥_H, **1**_H, i_H, F_H⟩ *where*

$$H = \{ \mathsf{row}(s) : s \in S \} \qquad\qquad \mathsf{row}(s_1) \odot_H \mathsf{row}(s_2) = \mathsf{row}(s_1 \cdot s_2)$$

$$\mathsf{row}(s_1) \parallel_H \mathsf{row}(s_2) = \mathsf{row}(s_1 \parallel s_2) \qquad \mathbf{1}_H = \mathsf{row}(1) \qquad i_H(\mathsf{a}) = \mathsf{row}(\mathsf{a})$$

$$F_H = \{ \mathsf{row}(s) : s \in S,\ \mathsf{row}(s)(\square) = 1 \}.$$

*Proof.* The operations ⊙_H and ∥_H are well-defined by closedness and consistency, and **1**_H is well-defined because 1 ∈ S by the observation table definition. Commutativity of ∥_H follows from commutativity of ∥, and similarly that **1**_H is a unit for both operations follows from 1 being a unit. Associativity follows by associativity of the table (it does *not* follow from · and ∥ being associative: given elements s1, s2, s3 ∈ S, s1 · s2 · s3 is not necessarily present in S ∪ S⁺).

Since a hypothesis is constructed from an observation table S, E that records for given s ∈ S and e ∈ E whether e[s] is accepted by the language or not, one would expect that the hypothesis classifies those pomsets

$$T_{\langle S, E \rangle} = \{ e[s] : s \in S, e \in E \}$$

correctly. This is not necessarily the case, as we show in the following example.

*Example 15.* Consider the language L from Example 7, and let S = {1, b} and E = {□, a(□ ∥ b)}. The induced table is

$$
\begin{array}{c|cc}
 & \square & \mathsf{a}(\square \parallel \mathsf{b})\\
\hline
1 & 0 & 0\\
\mathsf{b} & 1 & 1
\end{array}
\qquad\qquad
\begin{array}{c|cc}
 & \square & \mathsf{a}(\square \parallel \mathsf{b})\\
\hline
\mathsf{a} & 0 & 0\\
\mathsf{b}\mathsf{b} & 0 & 0\\
\mathsf{b} \parallel \mathsf{b} & 0 & 0
\end{array}
$$

From this closed, consistent, and associative table we obtain a hypothesis pomset recogniser that satisfies

$$\begin{aligned} (\mathsf{row}(\mathsf{a}) \odot (\mathsf{row}(\mathsf{b}) \parallel \mathsf{row}(\mathsf{b})))(\square) &= (\mathsf{row}(\mathsf{a}) \odot \mathsf{row}(\mathsf{b} \parallel \mathsf{b}))(\square) \\ &= (\mathsf{row}(\mathsf{a}) \odot \mathsf{row}(1))(\square) = \mathsf{row}(\mathsf{a})(\square) = 0 \neq 1 \end{aligned}$$

and thus recognises a language that differs from L on a · (b ∥ b) ∈ T_{⟨S,E⟩}.

We thus have the following definition, parametric in a subset of TS,E .

**Definition 16 (Compatible hypothesis).** *A closed, consistent, and associative observation table* ⟨S, E⟩ *induces a hypothesis* H *that is* X-compatible *with its table, for* X ⊆ SP*, if for* x ∈ X *we have* x ∈ L_H ⟺ x ∈ L*. We say that the hypothesis is* compatible *with its table if it is* T_{⟨S,E⟩}*-compatible with its table.*

Ensuring hypotheses are compatible with their table will not be a crucial step in proving termination, but plays a key role in ensuring minimality (Section 3.4). This was originally shown by van Heerdt [14] for Mealy machines.

#### **3.2 The Learning Algorithm**

We are now ready to introduce our learning algorithm, Algorithm 1. The main algorithm initialises the table to ⟨{1}, {□}⟩ and starts by augmenting the table to make sure it is closed and associative. We give an example below.


Algorithm 1: The pomset recogniser learning algorithm.

*Example 17 (Fixing closedness and associativity).* Consider the table from Example 9, where row(aa) ∉ {row(1), row(a), row(b)} witnesses a closedness defect. To fix this, the algorithm would add aa to the set S, which means row(aa) will become part of the carrier of the hypothesis.

Now consider the table from Example 12. Here we found an associativity defect, witnessed by row(a) = row(ab) and row(b) = row(ba) but row(aa) ≠ row(ab). More specifically, row(aa)(□) ≠ row(ab)(□). Thus, s1 = s3 = sl = a, s2 = sr = b, and e = □. A membership query on aba shows aba ∉ L, so b = 0. We have row(aa)(□) = 0, and therefore the algorithm would add the context □[a · □] = a · □ to E.

Note that the algorithm does not explicitly check for consistency; this is because we actually ensure a stronger property—sharpness [3]—as an invariant (Lemma 25). This property ensures every row indexed by a pomset in S is indexed by exactly one pomset in S (implying consistency):

**Definition 18 (Sharp table).** *An observation table* S, E *is* sharp *if for all* s1, s<sup>2</sup> ∈ S *such that* row(s1) = row(s2) *we have* s<sup>1</sup> = s2*.*

The idea of maintaining sharpness is due to Maler and Pnueli [29].

Once the table is closed and associative, we construct the hypothesis and check if it is compatible with its table. If this is not the case, a witness for incompatibility is a counterexample by definition, so HandleCounterexample is invoked to extract an extension of E, and we return to checking closedness and associativity. Once we obtain a hypothesis that is compatible with its table, we submit it to the teacher to check for equivalence with the target language. If the teacher provides a counterexample, we again process this and return to checking closedness and associativity. Once we have a compatible hypothesis for which there is no counterexample, we return this correct pomset recogniser.

The procedure HandleCounterexample, adapted from [7,8], is provided with an observation table ⟨S, E⟩, a pomset z, and a context c, and finds a single context to add to E. The main invariant is that c[z] is a counterexample. Recursive calls replace subpomsets from S+ with elements of S in this counterexample while maintaining the invariant. There are two types of return values: if c is a suitable context, c is returned; otherwise the return value is an element of S that is to replace z. The context c is suitable if z ∈ S+ and adding c to E would distinguish row(s) from row(z), where s ∈ S is such that currently row(s) = row(z). Because S is non-empty and subpomset-closed, if z ∉ S ∪ S+ it can be decomposed into z = u1 ♥ u2 for non-empty u1, u2 ∈ SP and ♥ ∈ {·, ∥}. We then recurse into u1 and u2 to replace them with elements u′1, u′2 of S and replace z with u′1 ♥ u′2 ∈ S+ in a final recursive call. If c = □, the return value cannot be in S, as we will show in Lemma 25 that these elements are not counterexamples.

*Example 19 (Processing a counterexample).* Consider L = {a, a · a, a ∥ a}, and let S = {1, a} and E = {□}. This induces a closed, sharp, and associative table

$$
\begin{array}{c|c}
 & \square\\
\hline
1 & 0\\
\mathsf{a} & 1\\
\end{array}
\qquad\qquad
\begin{array}{c|c}
 & \square\\
\hline
\mathsf{a} \cdot \mathsf{a} & 1\\
\mathsf{a} \parallel \mathsf{a} & 1\\
\end{array}
$$

Suppose an equivalence query on its pomset recogniser, which rejects only the empty pomset, gives counterexample z = a ∥ a ∥ aa. We may decompose z as (□ ∥ aa)[a ∥ a], where a ∥ a ∈ S+ \ S. Because row(a ∥ a) = row(a), (□ ∥ aa)[a] = a ∥ aa, and a ∥ aa ∈ L ⟺ z ∈ L, we update z = a ∥ aa and repeat the process. Now we decompose z = (a ∥ □)[aa]. Since row(aa) = row(a), (a ∥ □)[a] = a ∥ a, and a ∥ a ∈ L whereas z ∉ L, we finish by adding a ∥ □ to E.

### **3.3 Termination and Query Complexity**

Our termination argument is based on a comparison of the current observation table with the infinite table ⟨SP, PC⟩. We first show that the latter induces a hypothesis, called the *canonical pomset recogniser* for the language. Its underlying bimonoid is isomorphic to the syntactic bimonoid [28] for the language.

**Lemma 20.** ⟨SP, PC⟩ *is a closed, consistent, and associative observation table.*

**Definition 21 (Canonical pomset recogniser).** *The* canonical pomset recogniser *for* L *is the hypothesis for the observation table* ⟨SP, PC⟩*. We denote this hypothesis by* ⟨ML, ⊙L, ∥L, **1**L, iL, FL⟩*.*

The comparison of the current table with ⟨SP, PC⟩ is in terms of the number of distinct rows they hold. In the following lemma we show that the number of distinct rows of the former is bounded by that of the latter.

**Lemma 22.** *If* ML *is finite, any observation table* ⟨S, E⟩ *satisfies*

$$|\{\text{row}(s) : s \in S\}| \le |M\_{\mathcal{L}}|.$$

*Proof.* Note that ML = {row⟨SP,PC⟩(u) : u ∈ SP}. Given s1, s2 ∈ S such that row⟨S,E⟩(s1) ≠ row⟨S,E⟩(s2), we have row⟨SP,PC⟩(s1) ≠ row⟨SP,PC⟩(s2), since E ⊆ PC. This implies |{row(s) : s ∈ S}| ≤ |ML|.

An important fact will be that none of the pomsets in S can form a counterexample for the hypothesis of a table ⟨S, E⟩. In order to show this we will first show that the hypothesis is always *reachable*, a concept we define for arbitrary pomset recognisers below.

**Definition 23 (Reachability).** *A pomset recogniser* R = ⟨M, ⊙, ∥, **1**, i, F⟩ *is* reachable *if for all* m ∈ M *there exists* u ∈ SP *such that* i♯(u) = m*.*

Our reachability lemma relies on the fact that S is subpomset-closed.

**Lemma 24 (Hypothesis reachability).** *Given a closed, consistent, and associative observation table* ⟨S, E⟩*, the hypothesis it induces is reachable. In particular,* i♯H(s) = row(s) *for any* s ∈ S*.*

From the above it follows that we always have compatibility with respect to the set of row indices, as we show next.

**Lemma 25.** *The hypothesis of any closed, consistent, and associative observation table* ⟨S, E⟩ *is* S*-compatible.*

Before turning to our termination proof, we show that some simple properties hold throughout a run of the algorithm.

**Lemma 26 (Invariant).** *Throughout the execution of Algorithm 1,* ⟨S, E⟩ *is a sharp observation table.*

*Proof.* Subpomset-closedness holds throughout each run since {1} is subpomset-closed and adding a single element of S+ to S preserves the property.

For sharpness, first note that the initial table is sharp as it only has one row. Sharpness of S, E can only be violated when adding elements to S. But the only place where this happens is on line 7, and there the new row is unequal to all previous rows, which means sharpness is preserved.

The preceding results allow us to prove our termination theorem.

**Theorem 27 (Termination).** *If* M<sup>L</sup> *is finite, then Algorithm 1 terminates.*

*Proof.* First, we observe that fixing a closedness defect by adding a row (line 7) can only happen finitely many times, since, by Lemma 22, the size of {row(s) : s ∈ S} is bounded by |ML|.

This means that it suffices to show the following two points:

**–** fixing an associativity defect, fixing a compatibility defect, and handling a counterexample each eventually lead to a closedness defect, and can therefore occur only finitely often;

**–** every call to HandleCounterexample terminates.
Combined, these show that the algorithm terminates. For the first point, we treat each of the cases:

**–** An associativity defect comes with s1, s2, s3 ∈ S, sl, sr ∈ S, and e ∈ E such that row(sl) = row(s1 ♥ s2), row(sr) = row(s2 ♥ s3), and row(sl ♥ s3)(e) ≠ row(s1 ♥ sr)(e). A membership query determines b = 1 if e[s1 ♥ s2 ♥ s3] ∈ L and b = 0 otherwise. If row(sl ♥ s3)(e) ≠ b, then the context e[□ ♥ s3] distinguishes the previously equal rows row(sl) and row(s1 ♥ s2), creating a closedness defect.

Alternatively, row(sl ♥ s3)(e) = b means row(s1 ♥ sr)(e) ≠ b, for otherwise we would contradict row(sl ♥ s3)(e) ≠ row(s1 ♥ sr)(e). For similar reasons the context e[s1 ♥ □] in this case distinguishes the previously equal rows row(s2 ♥ s3) and row(sr), creating a closedness defect.

**–** A compatibility defect results in the identification of a counterexample, the handling of which we discuss next.

**–** Whenever a counterexample is identified, we eventually find a context c, s ∈ S, and t ∈ S+ \ S such that row(t) = row(s) and c[t] ∈ L ⟺ c[s] ∉ L. Thus, adding c to E creates a closedness defect.

Termination of HandleCounterexample follows: the first two recursive calls in the procedure replace z with strict subpomsets of z, whereas the last one replaces <sup>z</sup> with an element of <sup>S</sup><sup>+</sup>, so no further recursion will happen.

*Query Complexity.* We determine upper bounds on the membership and equivalence query numbers of a run of the algorithm in terms of the size of the canonical pomset recogniser n = |ML|, the size of the alphabet k = |Σ|, and the maximum number of operations (from {·, ∥}, used to compose alphabet symbols) m found in a counterexample. We note that since the number of distinct rows indexed by S is bounded by n and the table remains sharp throughout any run, the final size of S is at most n. Thus, the final size of S+ is in O(n² + k). Given the initialisation of S with a single element, the number of closedness defects fixed throughout a run is at most n − 1. This means that the total number of associativity defects fixed and counterexamples handled (including those resulting from compatibility defects) together is at most n − 1. We can already conclude that the number of equivalence queries posed is bounded by n. Moreover, we know that the final table will have at most n columns, and therefore the total number of cells in that table will be in O(n³ + kn).

The number of membership queries posed during a run of the algorithm is given by the number of cells in the table plus the number of queries needed during the processing of counterexamples. Consider the counterexample z that contains the maximum number of operations among those encountered during a run. The first two recursive calls of HandleCounterexample break down one operation, whereas the third is used to execute a base case making two membership queries and does not lead to any further recursion. The number of membership queries made starting from a given counterexample is thus in O(m). This means the total number of membership queries during the processing of counterexamples is in O(mn), from which we conclude that the number of membership queries posed during a run is in <sup>O</sup>(n<sup>3</sup> <sup>+</sup> mn <sup>+</sup> kn).

#### **3.4 Minimality of Hypotheses**

In this section we will show that all hypotheses submitted by the algorithm to the teacher are minimal. We first need to define what minimality means. As is the case for DFAs, it is the combination of an absence of unreachable states and of every state exhibiting its own distinct behaviour.

**Definition 28 (Minimality).** *A pomset recogniser* R = ⟨M, ⊙, ∥, **1**, i, F⟩ *is* minimal *if it is reachable and for all* u, v ∈ SP *with* i♯(u) ≠ i♯(v) *there exists* c ∈ PC *such that* c[u] ∈ LR ⟺ c[v] ∉ LR*.*

Before proving the main result of this section, we need the following:

**Lemma 29.** *For all pomset recognisers* ⟨M, ⊙, ∥, **1**, i, F⟩ *and* u, v ∈ SP *such that* i♯(u) = i♯(v)*, we have for any* c ∈ PC *that* i♯(c[u]) = i♯(c[v])*.*

The minimality theorem below relies on table compatibility, which allows us to distinguish the behaviour of states based on the contents of their rows. Note that the algorithm only submits a hypothesis in an equivalence query if that hypothesis is compatible with its table.

**Theorem 30 (Minimality of hypotheses).** *A closed, consistent, and associative observation table* ⟨S, E⟩ *induces a minimal hypothesis if the hypothesis is compatible with its table.*

*Proof.* We obtain the hypothesis from Lemma 14. Since S is subpomset-closed, we have by Lemma 24 that the hypothesis is reachable. Moreover, for every s ∈ S we have i♯H(s) = row(s). Consider u1, u2 ∈ SP such that i♯H(u1) ≠ i♯H(u2). Then there exist s1, s2 ∈ S such that row(s1) = i♯H(u1) and row(s2) = i♯H(u2), and we have row(s1) ≠ row(s2). Let e ∈ E be such that row(s1)(e) ≠ row(s2)(e). We have

$$\begin{aligned} i_H^\sharp(e[u_1]) \in F_H &\iff i_H^\sharp(e[s_1]) \in F_H &&\text{(Lemma 29)}\\ &\iff e[s_1] \in \mathcal{L}_{\mathcal{H}} \\ &\iff \mathsf{row}(s_1)(e) = 1 \\ &\iff \mathsf{row}(s_2)(e) = 0 \\ &\iff e[s_2] \notin \mathcal{L}_{\mathcal{H}} \\ &\iff i_H^\sharp(e[s_2]) \notin F_H \\ &\iff i_H^\sharp(e[u_2]) \notin F_H. \end{aligned}$$

As a corollary, we find that the canonical pomset recogniser is minimal.

**Proposition 31.** *The canonical pomset recogniser is minimal.*

#### **4 Conversion to Pomset Automata**

Bimonoids are a useful representation of pomset languages because sequential and parallel composition are on an equal footing; in the case of the learning algorithm of the previous section, this helps us treat both operations similarly. On the other hand, the behaviour of a program is usually thought of as a series of actions, some of which involve launching two or more threads that later combine. Here, sequential actions form the basic unit of computation, while fork/join patterns of threads are specified separately. *Pomset automata* [22] encode this more asymmetric model: they can be thought of as non-deterministic finite automata with an additional transition type that brokers forking and joining threads.

In this section, we show how to convert a pomset recogniser to a certain type of pomset automaton, where acceptance of a pomset is guided by its structure; conversely, we show that each of the pomset automata in this class can be represented by a pomset recogniser. Together with the previous section, this establishes that the languages of pomset automata in this class are learnable.

If S is a set, we write M(S) for the set of *finite multisets* over S. A finite multiset over S is written φ = {|s1, ..., sn|}.

**Definition 32 (Pomset automata).** *A* pomset automaton *(PA) is a tuple* A = ⟨Q, I, F, δ, γ⟩ *where*

**–** Q *is a set of states, with* I, F ⊆ Q *the initial and accepting states respectively;*

**–** δ : Q × Σ → 2^Q *is the sequential transition function;*

**–** γ : Q × M(Q) → 2^Q *is the parallel transition function.*

*Lastly, for every* q ∈ Q *there are finitely many* φ ∈ M(Q) *such that* γ(q, φ) ≠ ∅*.*

A finite PA can be represented graphically: every state is drawn as a vertex, with accepting states doubly circled and initial states pointed out by an arrow, while δ-transitions are represented by labelled edges, and γ-transitions are drawn as multi-ended edges. For instance, in Figure 1a, we have drawn a PA with states q0 through q5 with q5 accepting, and q1 ∈ δ(q0, a) (among other δ-transitions), while the multi-ended edge represents that q2 ∈ γ(q1, {|q3, q4|}), i.e., q1 can launch threads starting in q3 and q4, which, upon termination, resume in q2.

Fig. 1: Some pomset automata.

The sequential transition function is interpreted as in non-deterministic finite automata: if q′ ∈ δ(q, a), then a machine in state q may transition to state q′ after performing the action a. The intuition behind the parallel transition function is that if q′ ∈ γ(q, {|r1, ..., rn|}), then a machine in state q may launch threads starting in states r1 through rn, and when each of those has terminated successfully, may proceed in state q′. Note how the representation of starting states in a γ-transition allows for the possibility of launching multiple instances of the same thread, and disregards their order, i.e., γ(q, {|r1, ..., rn|}) = γ(q, {|rn, ..., r1|}). This intuition is made precise through the notion of a *run*.

**Definition 33 (Run relation).** *The* run relation *of a PA* A = ⟨Q, I, F, δ, γ⟩*, denoted* →A*, is defined as the smallest subset of* Q × SP × Q *satisfying*

$$\frac{}{q \xrightarrow{1}_A q} \qquad \frac{q' \in \delta(q, \mathsf{a})}{q \xrightarrow{\mathsf{a}}_A q'} \qquad \frac{\forall i.\; r_i \xrightarrow{u_i}_A r_i' \quad r_i' \in F \qquad q' \in \gamma(q, \{\!|r_1, \dots, r_n|\!\})}{q \xrightarrow{u_1 \parallel \cdots \parallel u_n}_A q'} \qquad \frac{q \xrightarrow{u}_A q'' \qquad q'' \xrightarrow{v}_A q'}{q \xrightarrow{u \cdot v}_A q'}$$

*The* language accepted *by* A *is* $L_A = \{u \in \mathsf{SP} : \exists q \in I, q' \in F.\; q \xrightarrow{u}_A q'\}$*.*

*Example 34.* If A is the PA from Figure 1a, we can see that $q_3 \xrightarrow{\mathsf{b}}_A q_5$ and $q_4 \xrightarrow{\mathsf{c}}_A q_5$ as a result of the second rule; by the third rule, we find that $q_1 \xrightarrow{\mathsf{b} \parallel \mathsf{c}}_A q_2$. Since $q_2 \xrightarrow{\mathsf{a}}_A q_5$ and $q_0 \xrightarrow{\mathsf{a}}_A q_1$ (again by the second rule), we can conclude $q_0 \xrightarrow{\mathsf{a} \cdot (\mathsf{b} \parallel \mathsf{c}) \cdot \mathsf{a}}_A q_5$ by repeated application of the last rule. The language accepted by this PA is the singleton set {a · (b ∥ c) · a}.
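
To make Definition 33 concrete, the following Python sketch computes the run relation of a finite PA by structural recursion on a series-parallel term and replays Example 34 on the part of the PA of Figure 1a that is described in the text. The encodings are assumptions made for this sketch, and "successful termination" of forked threads is read here as reaching an accepting state.

```python
# Computes the run relation of a pomset automaton (Definition 33) by structural
# recursion on a series-parallel term, and replays Example 34.  Encodings are
# illustrative assumptions; forked threads are required to end in accepting states.
from collections import Counter
from itertools import product

def mset(states):
    return tuple(sorted(Counter(states).items()))   # canonical finite multiset

class PA:
    def __init__(self, states, initial, accepting, delta, gamma):
        self.states, self.I, self.F = states, initial, accepting
        self.delta = delta            # (state, letter)   -> set of states
        self.gamma = gamma            # (state, multiset) -> set of states

    def runs(self, t):
        """All pairs (q, q') such that q --t--> q'."""
        if t[0] == 'one':
            return {(q, q) for q in self.states}
        if t[0] == 'atom':
            return {(q, q2) for q in self.states
                            for q2 in self.delta.get((q, t[1]), set())}
        if t[0] == 'seq':
            left, right = self.runs(t[1]), self.runs(t[2])
            return {(q, q2) for (q, r) in left for (r2, q2) in right if r == r2}
        if t[0] == 'par':
            threads = [self.runs(u) for u in t[1:]]
            result = set()
            for choice in product(*threads):         # one run per thread
                if all(end in self.F for (_, end) in choice):
                    phi = mset(start for (start, _) in choice)
                    for q in self.states:
                        result |= {(q, q2) for q2 in self.gamma.get((q, phi), set())}
            return result
        raise ValueError(t)

    def accepts(self, t):
        return any(q in self.I and q2 in self.F for (q, q2) in self.runs(t))

# The PA of Figure 1a, as far as it is described in the text.
states = {'q0', 'q1', 'q2', 'q3', 'q4', 'q5'}
delta = {('q0', 'a'): {'q1'}, ('q2', 'a'): {'q5'},
         ('q3', 'b'): {'q5'}, ('q4', 'c'): {'q5'}}
gamma = {('q1', mset(['q3', 'q4'])): {'q2'}}
A = PA(states, {'q0'}, {'q5'}, delta, gamma)

t = ('seq', ('atom', 'a'),
     ('seq', ('par', ('atom', 'b'), ('atom', 'c')), ('atom', 'a')))
print(A.accepts(t))                 # True:  a . (b || c) . a  is accepted
print(A.accepts(('atom', 'a')))     # False
```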

In general, finite pomset automata can accept a very wide range of pomset languages, including all context-free (pomset) languages [23]. The intuition behind this is that the mechanism of forking and joining encoded in γ can be used to simulate a call stack. For example, the automaton in Figure 1b accepts the strictly context-free language (of words) {aⁿ · bⁿ : n ∈ ℕ}. It follows that PAs can represent strictly more pomset languages than pomset recognisers. To tame the expressive power of PAs at least slightly, we propose the following.

**Definition 35 (Saturation).** *We say that* A = ⟨Q, I, F, δ, γ⟩ *is* saturated *when for all* u, v ∈ SP *with* u, v ≠ 1*, both of the following are true:*

**–** *whenever* $q \xrightarrow{u \cdot v}_A q'$*, there exists* q′′ ∈ Q *such that* $q \xrightarrow{u}_A q''$ *and* $q'' \xrightarrow{v}_A q'$*;*

**–** *whenever* $q \xrightarrow{u \parallel v}_A q'$*, there exist* r, r′, s, s′ ∈ Q *such that* $r \xrightarrow{u}_A r'$*,* $s \xrightarrow{v}_A s'$*,* r′, s′ ∈ F*, and* q′ ∈ γ(q, {|r, s|})*.*

*Example 36.* Returning to Figure 1, we see that the PA in Figure 1a is saturated, while the one in Figure 1b is not, as a result of the run $q_1 \xrightarrow{\mathsf{a} \cdot \mathsf{a} \cdot \mathsf{b} \cdot \mathsf{b}}_A q_4$, which does not admit an intermediate state q′ such that $q_1 \xrightarrow{\mathsf{a} \cdot \mathsf{a}}_A q'$ and $q' \xrightarrow{\mathsf{b} \cdot \mathsf{b}}_A q_4$.

We now have everything in place to convert the encoding of a language given by a pomset recogniser to a pomset automaton. The idea is to represent every element q of the bimonoid by a state which accepts exactly the language of pomsets mapped to q; the transition structure is derived from the operations.

**Lemma 37.** *Let* R = ⟨M, ⊙, ∥, **1**, i, F⟩ *be a pomset recogniser. We construct the pomset automaton* A = ⟨M, F, {**1**}, δ, γ⟩ *(note: we use* F *as the set of initial states and* {**1**} *as the set of accepting states), where* δ : M × Σ → 2^M *and* γ : M × M(M) → 2^M *are given by*

$$\delta(q, \mathsf{a}) = \{q' : i(\mathsf{a}) \odot q' = q\} \qquad \gamma(q, \phi) = \{q' : (r \parallel r') \odot q' = q,\ \phi = \{\!|r, r'|\!\}\}$$

*Then* A *is saturated, and* L<sup>A</sup> = LR*.*
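
The construction of Lemma 37 is entirely mechanical once the bimonoid is given by finite tables. The sketch below carries it out for a small assumed example recogniser (the "odd number of events" bimonoid over a one-letter alphabet), which is not taken from the paper but illustrates how δ and γ are read off the operations.

```python
# The construction of Lemma 37, instantiated on a small assumed example:
# the pomset recogniser with carrier Z/2, both operations addition mod 2,
# unit 0, i(a) = 1 and F = {1}; it recognises the pomsets with an odd
# number of events over the alphabet {a}.
from itertools import product

M = (0, 1)
UNIT = 0
def odot(x, y): return (x + y) % 2          # sequential product
def parr(x, y): return (x + y) % 2          # parallel product
def i(letter):  return 1                    # only letter: 'a'
F = {1}
ALPHABET = ('a',)

# delta(q, a) = {q' : i(a) . q' = q}
delta = {(q, a): {qp for qp in M if odot(i(a), qp) == q}
         for q in M for a in ALPHABET}

# gamma(q, {|r, r'|}) = {q' : (r || r') . q' = q}
def multiset(r, rp): return tuple(sorted((r, rp)))
gamma = {(q, multiset(r, rp)): {qp for qp in M if odot(parr(r, rp), qp) == q}
         for q in M for r, rp in product(M, repeat=2)}

print(delta)
print(gamma)
# The resulting PA has F = {1} as initial states and {UNIT} = {0} as the only
# accepting state; e.g. it accepts a (one delta-step 1 --a--> 0) and rejects
# a.a (1 --a--> 0 --a--> 1, not accepting), matching the recogniser.
```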

*Example 38.* Let ⟨M, ⊙, ∥, **1**, i, F⟩ be the pomset recogniser from Example 7. The pomset automaton that arises from the construction above is partially depicted in Figure 2; we have not drawn the state q⊥ and its incoming transitions, or forks into **1**, to avoid clutter. In this PA, we see that, since qa ⊙ q1 = qb and i(a) = qa, we have q1 ∈ δ(qb, a). Furthermore, since (qb ∥ qb) ⊙ **1** = q1 ⊙ **1** = q1, we also have **1** ∈ γ(q1, {|qb, qb|}). Finally, qb is initial, since F = {qb}.

Fig. 2: Part of the PA obtained from the pomset recogniser from Example 7, using the construction from Lemma 37. The state q<sup>⊥</sup> (which does not contribute to the language of the automaton) and forks into the state **1** are not pictured.

We have thus shown that the language of any pomset recogniser can be accepted by a finite and saturated PA. In turn, this shows that our algorithm can, in principle, be adapted to work with a teacher that takes a (saturated) PA instead of a pomset recogniser as hypothesis, by simply converting the hypothesis pomset recogniser to an equivalent PA before sending it over.

Conversely, we can show that the transition relations of a saturated PA carry the algebraic structure of a bimonoid, and use that to show that a language recognised by a saturated PA is also recognised by a bimonoid. This shows that our characterisation is "tight", i.e., languages recognised by saturated PAs are precisely those recognised by bimonoids, and hence learnable.

**Lemma 39.** *Let* A = ⟨Q, I, F, δ, γ⟩ *be a saturated pomset automaton. We can construct a pomset recogniser* R = ⟨M, ⊙, ∥, **1**, i, F′⟩*, with* **1** *the relation* $\xrightarrow{1}_A$*, where*

$$M = \left\{ \xrightarrow{u}_A \;:\; u \in \mathsf{SP} \right\} \qquad\quad \xrightarrow{u}_A \odot \xrightarrow{v}_A \;=\; \xrightarrow{u \cdot v}_A \qquad\quad \xrightarrow{u}_A \parallel \xrightarrow{v}_A \;=\; \xrightarrow{u \parallel v}_A$$

$$i(\mathsf{a}) = \xrightarrow{\mathsf{a}}_A \qquad\quad F' = \left\{ \xrightarrow{u}_A \in M \;:\; \exists q \in I, q' \in F.\; q \xrightarrow{u}_A q' \right\}$$

*Now* ⊙ *and* ∥ *are well-defined, and* R *is a pomset recogniser such that* LR = LA*.*

If A is finite, then so is R, since each of the elements of M is a relation on Q, and there are finitely many relations on a finite set.

In general, the PA obtained from a pomset recogniser may admit runs where the same fork transition is nested repeatedly. Recognisable pomset languages of *bounded width* may be recognised by a pomset recogniser that is *depth-nilpotent* [28], which can be converted into a *fork-acyclic* PA by way of an sr-expression [28,22]. However, this detour via sr-expressions is not necessary: one can adapt Lemma 37 to produce a fork-acyclic PA when given a depth-nilpotent pomset recogniser. The details are discussed in the full version [15].

We conclude this section by remarking that the minimal pomset recogniser for a bounded-width language is necessarily depth-nilpotent [28]; since our algorithm produces a minimal pomset recogniser, this means that we can also produce a fork-acyclic PA after learning a bounded-width recognisable pomset language.

#### **5 Discussion**

To learn DFAs, there are several alternatives to the observation table data structure that reduce the space complexity of the algorithm. Most notable is the *classification tree* [25], which distinguishes individual pairs of words (which for us would be pomsets) at every node rather than filling an entire row for each of them. The TTT algorithm [19] further builds on this and achieves optimal space complexity. Given that we developed the first learning algorithm for pomset languages, we opted for the simplicity of the observation table—optimisations such as those analogous to the aforementioned work are left to future research.

We would like to extend our algorithm to learn recognisers based on arbitrary algebraic theories. One challenge is to ensure that the equations of the theory hold for hypotheses, by generalising our definition of associativity (Definition 13).

Our algorithm can also be specialised to learn languages recognised by commutative monoids. These languages of *multisets* can alternatively be represented as semi-linear sets [30] or described using Presburger arithmetic [11]. While not all languages described this way are recognisable (for instance, the set of multisets over Σ = {a, b} with as many a's as b's [28]), it would be interesting to be able to learn at least the fragment representable by commutative monoids, and apply that to one of the domains where semi-linear sets are used.

Our algorithm is limited to learning languages of series-parallel pomsets; there exist pomsets which are not series-parallel, each of which must contain an "N-shape" [12,13,35]. Since N-shapes appear in pomsets that describe message passing between threads, we would like to be able to learn such languages as well. We do not see an obvious way to extend our algorithm to include these pomsets, but perhaps recent techniques from [10] can provide a solution.

Every hypothesis of our algorithm can be converted to a pomset automaton. The final pomset recogniser for a bounded-width language is minimal, and hence depth-nilpotent [28], which means that it can be converted to a fork-acyclic PA. In future work, we would like to guarantee that the same holds for intermediate hypotheses when learning a bounded-width language.

Running two threads in parallel may be implemented by running some initial section of those threads in parallel, followed by running the remainder of those threads in parallel. This interleaving is represented by the *exchange law* [12,13]. One can specialise pomset recognisers to include this interleaving to obtain recognisers of pomset languages closed under subsumption [28], i.e., such that if a pomset u is recognised, then so are all of the "more sequential" versions of u. We would like to adapt our algorithm to learn these types of recognisers, and exploit the extra structure provided by the exchange law to optimise further.

We have shown that recognisable pomset languages correspond to saturated regular pomset languages (Lemmas 37 and 39). One question that remains is whether there is an algorithm that can learn all, or at least a larger class of, regular pomset languages. Given that pomset automata can accept context-free languages (Figure 1b), we wonder if a suitable notion of context-free grammars for pomset languages could be identified. Clark [6] showed that there exists a subclass of context-free languages that can be learned via an adaptation of L∗. Arguably, this adaptation learns recognisers with a monoidal structure and reverses this structure to obtain a grammar. An extension of this work to pomset languages might lead to a learning algorithm that learns more PAs.

### **References**



#### **The Structure of Sum-Over-Paths, its Consequences, and Completeness for Clifford**

Renaud Vilmart

Université Paris-Saclay, ENS Paris-Saclay, Inria, CNRS, LMF, 91190 Gif-sur-Yvette, France vilmart@lsv.fr

**Abstract.** We show that the formalism of "Sum-Over-Paths" (SOP), used for symbolically representing linear maps or quantum operators, together with a proper rewrite system, has the structure of a dagger-compact PROP. Several consequences arise from this observation:

– Morphisms of SOP are very close to the diagrams of the graphical calculus called ZH-Calculus, so we give a system of interpretation between the two

– A construction, called the discard construction, can be applied to enrich the formalism so that, in particular, it can represent the quantum measurement.

We also enrich the rewrite system so as to get the completeness of the Clifford fragments of both the initial formalism and its enriched version.

**Keywords:** Categorical Quantum Mechanics · Dagger-Compact PROP · Sum-Over-Paths · Clifford Fragment · Normal Form · Rewriting · Discard Construction · Verification.

### **1 Introduction**

The "Sum-Over-Paths" (SOP) formalism [1] was introduced in order to perform verification on quantum circuits. It is inspired by Feynman's notion of pathintegrals, and can be conceived as a discrete version of it.

The core idea here is to represent unitary transformations in a symbolic way, so as to be able to simplify the term, which would for instance accelerate its evaluation. To do so, the formalism comes equipped with a rewrite system, which reduces any term into an equivalent one.

As pure quantum circuits (which represent unitary maps) can easily be mapped to an SOP morphism, one can try and perform verification: given a specification S and another SOP morphism t obtained from a circuit supposed to implement the specification, we can compute the term S ◦t † and try to reduce it to the identity. In a very similar way, one can check whether two quantum circuits implement the same unitary map.

This work was done during a Postdoc funded by the project PIA-GDN/Quantex. Proofs can be found at arXiv:2003.05678.


The rewrite system is known to be complete for Clifford unitary maps, i.e. in the Clifford fragment of quantum mechanics, the term obtained from $t_1 \circ t_2^\dagger$ will reduce to the identity iff $t_1$ and $t_2$ represent the same unitary map. Moreover, this reduction terminates in time polynomial in the size of the SOP term (itself related to the size of the quantum circuit), and still performs well outside the Clifford fragment.

Lately, the SOP formalism has been used for efficient verification of optimisation strategies such as [4,12], as well as for specification of quantum circuits [6].

In this paper, we are interested in extensions of the formalism. We first focus on its categorical structure, and show that arbitrary terms already go beyond the representation of unitary maps. We then turn to extending the formalism to encompass mixed quantum processes. In both cases, we show a completeness result for their respective Clifford fragment.

In Section 2, we explain in detail the structure of †-compact PROPs, which we show in Section 3 to be shared by **SOP**.

Because the formalism is no longer restricted to unitary maps, we argue that it could benefit from a slight redefinition, which is done in Section 4.

Another "family" of categories that share this structure is the family of graphical languages for quantum computation: ZX-Calculus, ZW-Calculus and ZH-Calculus [3,7,8]. All three formalisms represent morphisms of **Qubit** using diagrams, and come with equational theories, proven to be complete for the whole category [3,11,19], i.e. whenever two diagrams represent the same morphism of **Qubit**, the first can be turned into the other using only the equational theory.

In Section 5, we present interpretations between the respective Clifford fragments of the ZH-calculus and **SOP**, in a slightly different way than in [14,15], partly thanks to our redefinition of sums-over-paths.

In Section 6, we realise that the original rewrite system of **SOP** is not enough for the completeness of the Clifford fragment of **Qubit**. We hence enrich the set of rules so as to get the completeness in this restriction.

In Section 7, we enrich the whole formalism using the discard construction [5], so as to be able to represent completely positive maps, as well as the operator of partial trace. Again, one can consider the Clifford fragment of this formalism. We give a new set of rewrite rules, and show that it makes the fragment complete.

### **2 Background**

### **2.1 PROPs and String Diagrams**

The first kind of category we will be interested in is the *PROP* [13,20]. A PROP **C** is a strict symmetric monoidal category (SMC) [16,18] generated by a single object, or equivalently, whose objects form N. Hence the morphisms of **C** are of the form f : n → m. They can be composed sequentially (. ◦ .) or in parallel (. ⊗ .), and they satisfy the following axioms:

$$f \circ (g \circ h) = (f \circ g) \circ h \qquad \qquad f \otimes (g \otimes h) = (f \otimes g) \otimes h$$

$$\begin{aligned} id_m \circ f = f = f \circ id_n & \qquad id_0 \otimes f = f = f \otimes id_0\\ (f_2 \circ f_1) \otimes (g_2 \circ g_1) &= (f_2 \otimes g_2) \circ (f_1 \otimes g_1) \end{aligned}$$

The category is also equipped with a particular family of morphisms σn,m : n + m → m + n. Intuitively, these allow morphisms to swap places. They satisfy additional axioms:

$$\begin{aligned} \sigma\_{n,m+p} = (\operatorname{id}\_m \otimes \sigma\_{n,p}) \circ (\sigma\_{n,m} \otimes \operatorname{id}\_p) \\ \sigma\_{m,n} \circ \sigma\_{n,m} = \operatorname{id}\_{n+m} \end{aligned} \qquad \begin{aligned} \sigma\_{n+m,p} = (\sigma\_{n,p} \otimes \operatorname{id}\_m) \circ (\operatorname{id}\_n \otimes \sigma\_{m,p}) \\ (\operatorname{id}\_p \otimes f) \circ \sigma\_{n,p} = \sigma\_{m,p} \circ (f \otimes \operatorname{id}\_p) \end{aligned}$$
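
In the matrix model introduced in Section 2.3 below, these axioms are ordinary facts about Kronecker products; the following numpy snippet is a quick numerical sanity check of the interchange law and of the naturality of the swap, nothing more.

```python
# Numerical sanity check of the interchange law and swap naturality for
# 2x2 matrices, i.e. in the PROP Qubit introduced in Section 2.3.
import numpy as np

rng = np.random.default_rng(0)
def rand_map():
    return rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

f1, f2, g1, g2 = (rand_map() for _ in range(4))

# (f2 . f1) (x) (g2 . g1) = (f2 (x) g2) . (f1 (x) g1)
lhs = np.kron(f2 @ f1, g2 @ g1)
rhs = np.kron(f2, g2) @ np.kron(f1, g1)
assert np.allclose(lhs, rhs)

# sigma_{1,1} is the swap of two qubits; naturality: (id (x) f) . swap = swap . (f (x) id)
swap = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])
f = rand_map()
assert np.allclose(np.kron(np.eye(2), f) @ swap, swap @ np.kron(f, np.eye(2)))
print("interchange law and swap naturality hold numerically")
```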

### **2.2** *†***-Compact PROPs**

Some PROPs can have additional structure, such as a compact-closed structure, or a †-functor.

A †-PROP **C** is a PROP together with an involutive, identity-on-objects functor $(.)^\dagger : \mathbf{C}^{op} \to \mathbf{C}$ compatible with (. ⊗ .). That is, for every morphism f : n → m, there is a morphism $f^\dagger : m \to n$ such that $f^{\dagger\dagger} = f$. It behaves with the compositions by $(f \circ g)^\dagger = g^\dagger \circ f^\dagger$ and $(f \otimes g)^\dagger = f^\dagger \otimes g^\dagger$. Finally, we have $\sigma_{n,m}^\dagger = \sigma_{m,n}$.

A †-compact PROP has two particular families of morphisms: $\eta_n : 0 \to 2n$ and $\epsilon_n : 2n \to 0$. These are dual by the †-functor: $\eta_n^\dagger = \epsilon_n$. They satisfy the following axioms:

$$\begin{aligned} (\epsilon\_n \otimes id\_n) \circ (id\_n \otimes \eta\_n) &= id\_n = (id\_n \otimes \epsilon\_n) \circ (\eta\_n \otimes id\_n), \\ \sigma\_{n,n} \circ \eta\_n = \eta\_n &\qquad \eta\_{n+m} = (id\_n \otimes \sigma\_{n,m} \otimes id\_m) \circ (\eta\_n \otimes \eta\_m) \end{aligned}$$

In this context, one can define the transpose operator of a morphism f as:

$$f^t := (\epsilon_m \otimes id_n) \circ (id_m \otimes f \otimes id_n) \circ (id_m \otimes \eta_n)$$

One can check that, thanks to the axioms of †-compact PROPs, $(f \circ g)^t = g^t \circ f^t$, $(f \otimes g)^t = f^t \otimes g^t$, and $f^{tt} = f$.

We can then compose $(.)^t$ and $(.)^\dagger$: $\overline{(.)} := (.)^{\dagger t}$. Again using the axioms of †-compact PROPs, one can check that $(.)^{\dagger t} = (.)^{t\dagger}$.

#### **2.3 Example: Qubit**

The usual example of a strict symmetric †-compact monoidal category is **FHilb**, the category whose objects are finite dimensional Hilbert spaces, and whose morphisms are linear maps between them. It is not, however, a PROP, as it is not generated by a single object.

One subcategory of **FHilb** that *is* a PROP, though, is **Qubit**, the subcategory of **FHilb** generated by the object $\mathbb{C}^2$, considered as the object 1. A morphism f : n → m of **Qubit** is hence a linear map from $\mathbb{C}^{2^n}$ to $\mathbb{C}^{2^m}$. (. ◦ .) is then the usual composition of linear maps, and (. ⊗ .) is the usual tensor product of linear maps. One can check that the first set of axioms is satisfied.

This is not enough to conclude that **Qubit** is a PROP. We still need to define a family of morphisms σn,m. In the Dirac notation, given a basis $\mathcal{B}$ of $\mathbb{C}^2$, we can define σn,m as $\sigma_{n,m} := \sum_{(\boldsymbol{x},\boldsymbol{y})\in\mathcal{B}^n\times\mathcal{B}^m} |\boldsymbol{y}, \boldsymbol{x}\rangle\langle \boldsymbol{x}, \boldsymbol{y}|$. One can then check that all the axioms of PROPs are satisfied.

**Qubit** is not only a PROP, but also †-compact. Indeed, first, given a morphism:

$$f = \sum\_{(x,y)\in\mathcal{B}^n\times\mathcal{B}^m} a\_{x,y}|y\rangle\langle x|$$

we can define its dagger $f^\dagger := \sum_{(\boldsymbol{x},\boldsymbol{y})\in\mathcal{B}^n\times\mathcal{B}^m} \overline{a_{\boldsymbol{x},\boldsymbol{y}}}\; |\boldsymbol{x}\rangle\langle \boldsymbol{y}|$, which is the usual definition of the dagger for linear maps.

Its compact structure can be given by $\eta_n := \sum_{\boldsymbol{x}\in\mathcal{B}^n} |\boldsymbol{x}, \boldsymbol{x}\rangle$, which implies $\epsilon_n = \eta_n^\dagger = \sum_{\boldsymbol{x}\in\mathcal{B}^n} \langle \boldsymbol{x}, \boldsymbol{x}|$. One can check that all the axioms of †-compact PROPs are satisfied.
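
For n = 1 the cup, the cap, and the snake equation can be checked directly in numpy; the following snippet does so in the computational basis (an illustrative check only).

```python
# eta_1 = sum_x |x,x>, eps_1 = eta_1^dagger = sum_x <x,x| (computational basis),
# and a check of the snake equation (eps (x) id) . (id (x) eta) = id.
import numpy as np

ket = [np.array([[1.], [0.]]), np.array([[0.], [1.]])]
eta = sum(np.kron(k, k) for k in ket)            # shape (4, 1): morphism 0 -> 2
eps = eta.conj().T                               # shape (1, 4): morphism 2 -> 0
I = np.eye(2)

snake = np.kron(eps, I) @ np.kron(I, eta)        # morphism 1 -> 1
assert np.allclose(snake, I)
print(snake.real)
```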

Since **Qubit** is †-compact, we can define the transpose $(.)^t$, which happens to be the usual transpose of linear maps, and the conjugate $\overline{(.)}$, which again is the usual conjugation of linear maps over $\mathbb{C}$.

There is a subcategory of **Qubit** that is of importance: **Stab**. It is the smallest †-compact subcategory of **Qubit** (the compact structure is preserved) that contains:

$$\begin{array}{l} - & |0\rangle : 0 \to 1\\ - & H := \frac{1}{\sqrt{2}} (|0\rangle\langle 0| + |0\rangle\langle 1| + |1\rangle\langle 0| - |1\rangle\langle 1|) : 1 \to 1\\ - & S := |0\rangle\langle 0| + i \, |1\rangle\langle 1| : 1 \to 1\\ - & CZ := |00\rangle\langle 00| + |01\rangle\langle 01| + |10\rangle\langle 10| - |11\rangle\langle 11| : 2 \to 2 \end{array}$$
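
For reference, these generators are the following matrices (a plain numpy transcription; the unitarity check at the end is only a sanity test).

```python
# The generators of Stab as explicit matrices.
import numpy as np

ket0 = np.array([[1.], [0.]])                          # |0> : 0 -> 1
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)           # Hadamard : 1 -> 1
S = np.diag([1, 1j])                                   # phase gate : 1 -> 1
CZ = np.diag([1, 1, 1, -1])                            # controlled-Z : 2 -> 2

for U in (H, S, CZ):
    assert np.allclose(U.conj().T @ U, np.eye(U.shape[0]))   # all unitary
```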

### **3 The Category SOP**

#### **3.1 SOP as a PROP**

The point of the Sum-Over-Paths formalism [1] is to *symbolically* manipulate morphisms written in a form akin to the Dirac notation. Reasoning on symbolic terms allows us to detect where a term can be simplified to a "smaller" one, or to give a specification on a term.

A morphism of the category will be of the form $|\boldsymbol{x}\rangle \mapsto s \sum_{\boldsymbol{y}\in V^k} e^{2i\pi P(\boldsymbol{x},\boldsymbol{y})} \left|\boldsymbol{Q}(\boldsymbol{x},\boldsymbol{y})\right\rangle$ where:

**–** s ∈ ℝ is a scalar;

**–** $\boldsymbol{x}$ is a list of n input variables and $\boldsymbol{y}$ a list of k internal variables, all taken from the set V;

**–** P, the *phase polynomial*, is a real multivariate polynomial over these variables;

**–** $\boldsymbol{Q}$ is a list of m boolean polynomials over the same variables, describing the output.
We may denote by V_f a subset of the variables V used in f. Then by default, if V_f and V_g are used in the same term, we consider that V_f ∩ V_g = ∅. To distinguish the two sum operators (the one in P and the one in **Q**), we can denote the one in the output signature **Q** as ⊕. Moreover, it will sometimes be necessary to immerse one of the boolean polynomials Q_i in the polynomial P. We hence define $\widehat{Q_i}$ inductively as $\widehat{x} = x$ for a variable x, $\widehat{pq} = \widehat{p}\,\widehat{q}$ and $\widehat{p \oplus q} = \widehat{p} + \widehat{q} - 2\widehat{p}\,\widehat{q}$.

**Definition 1 (SOP). SOP** *is defined as the PROP where, given a set of variables* V *:*

**–** *Morphisms* n → m *are of the form* $f : |\boldsymbol{x}\rangle \mapsto s \sum_{\boldsymbol{y}\in V^k} e^{2i\pi P(\boldsymbol{x},\boldsymbol{y})} \left|\boldsymbol{Q}(\boldsymbol{x},\boldsymbol{y})\right\rangle$ *where* $s \in \mathbb{R}$*,* $\boldsymbol{x} \in V^n$*,* $P \in \mathbb{R}[X_1,\dots,X_{n+k}]/(1,\, X_i^2 - X_i)$*, and* $\boldsymbol{Q} \in (\mathbb{F}_2[X_1,\dots,X_{n+k}])^m$

**–** *Composition is obtained as*

$$f \circ g := |\boldsymbol{x}_g\rangle \mapsto s_f s_g \sum_{\substack{\boldsymbol{y}_f \in V_f^{k_f} \\ \boldsymbol{y}_g \in V_g^{k_g}}} e^{2i\pi \left(P_g + P_f\left[\boldsymbol{x}_f \leftarrow \widehat{\boldsymbol{Q}_g}\right]\right)} \left| \boldsymbol{Q}_f[\boldsymbol{x}_f \leftarrow \boldsymbol{Q}_g] \right\rangle$$

**–** *Tensor product is obtained as*

$$f \otimes g := |\boldsymbol{x}_f \boldsymbol{x}_g\rangle \mapsto s_f s_g \sum_{\substack{\boldsymbol{y}_f \in V_f^{k_f} \\ \boldsymbol{y}_g \in V_g^{k_g}}} e^{2i\pi (P_g + P_f)} \left| \boldsymbol{Q}_f \boldsymbol{Q}_g \right\rangle$$

**–** *The identities are* $id_n : |\boldsymbol{x}\rangle \mapsto |\boldsymbol{x}\rangle$ *and the symmetric braiding is* $\sigma_{n,m} : |\boldsymbol{x}_1, \boldsymbol{x}_2\rangle \mapsto |\boldsymbol{x}_2, \boldsymbol{x}_1\rangle$

The polynomial P is called the *phase polynomial*, as it appears in the morphism in $e^{2i\pi(.)}$. Because of this, we consider the polynomial modulo 1. We also consider the polynomial quotiented by $X^2 - X$ for all its variables X, as these variables are to be evaluated in {0, 1}, so we consider $X^2 = X$.

Notice that the definition of the identities does not directly fit the description of the morphisms. However, we can rewrite it as $|\boldsymbol{x}\rangle \mapsto |\boldsymbol{x}\rangle = |\boldsymbol{x}\rangle \mapsto 1\sum_{y\in V^0} e^{2i\pi 0} |\boldsymbol{x}\rangle$. Hence, when we sum over a single element, we may forget the sum operator, and when the phase polynomial is 0, we may not write it. Notice by the way that $id_0 = |\rangle \mapsto |\rangle$. Indeed, $|\rangle$ is absolutely valid: it represents an empty register.

*Example 1.* We can give the **SOP** version of the usual quantum gates:

$$\begin{aligned} R_Z(\alpha) &:= |x\rangle \mapsto e^{2i\pi \frac{\alpha}{2\pi}x} |x\rangle & CNot &:= |x_1, x_2\rangle \mapsto |x_1, x_1 \oplus x_2\rangle \\ H &:= |x\rangle \mapsto \frac{1}{\sqrt{2}} \sum_{y \in V} e^{2i\pi \frac{xy}{2}} |y\rangle & CZ &:= |x_1, x_2\rangle \mapsto e^{2i\pi \frac{x_1 x_2}{2}} |x_1, x_2\rangle \end{aligned}$$

*Example 2.* Let us derive the operation (id ⊗ H) ◦ *CNot*:

$$(id \otimes H) \circ CNot$$

$$\begin{aligned} &= \left( |x\_1, x\_2\rangle \mapsto \frac{1}{\sqrt{2}} \sum\_{y \in V} e^{2i\pi \frac{x\_2 y}{2}} |x\_1, y\rangle \right) \circ \left( |x\_1, x\_2\rangle \mapsto |x\_1, x\_1 \oplus x\_2\rangle \right) \\ &= |x\_1, x\_2\rangle \mapsto \frac{1}{\sqrt{2}} \sum\_{y \in V} e^{2i\pi \frac{(x\_1 + x\_2 - 2x\_1 x\_2)y}{2}} |x\_1, y\rangle \end{aligned}$$

where $x_1 + x_2 - 2x_1x_2 = \widehat{x_1 \oplus x_2}$.

The previous definition contains a claim: that **SOP** is a PROP. To be so, one has to check all the axioms of PROPs. One has to be careful when doing so. Indeed, the sequential composition (. ◦ .) induces a substitution. Hence, one has to check all the axioms in the presence of a "context", that is, one has to show that the axioms can be applied *locally*.

If an axiom states t<sup>1</sup> → t2, one should ideally check that A◦(id<sup>n</sup> ⊗t<sup>1</sup> ⊗idm)◦ B → A ◦ (id<sup>n</sup> ⊗ t<sup>2</sup> ⊗ idm) ◦ B for any "before" morphism B and any "after" morphism A. However, this can be easily reduced to checking that A ◦ t<sup>1</sup> ◦ B → A ◦ t<sup>2</sup> ◦ B.

In the case of the axioms of PROPs, this can further be reduced to showing the axioms without context, as neither id<sup>n</sup> nor σn,m introduce variables or phases. For the other axioms, however, the context will have to be taken into account. A fairly straightforward but tedious verification gives that, indeed, **SOP** is a PROP.

#### **3.2 From SOP to Qubit**

To check the soundness of what we are going to do in the following, it may be interesting to have a way of interpreting morphisms of **SOP** as morphisms of **Qubit**.

**Definition 2.** *The functor* ⟦.⟧ : **SOP** → **Qubit** *is defined as being the identity on objects, and such that*

$$\left[\!\!\left[\, |\boldsymbol{x}\rangle \mapsto s \sum_{\boldsymbol{y} \in V^k} e^{2i\pi P(\boldsymbol{x},\boldsymbol{y})} \left| \boldsymbol{Q}(\boldsymbol{x},\boldsymbol{y}) \right\rangle \,\right]\!\!\right] := s \sum_{(\boldsymbol{x},\boldsymbol{y}) \in \{0,1\}^n \times \{0,1\}^k} e^{2i\pi P(\boldsymbol{x},\boldsymbol{y})} \left| \boldsymbol{Q}(\boldsymbol{x},\boldsymbol{y}) \right\rangle \langle \boldsymbol{x}|$$

*Example 3.* The interpretation of H is as intended the Hadamard gate:

$$\left[\!\!\left[ H \right]\!\!\right] = \frac{1}{\sqrt{2}} \sum_{x, y \in \{0,1\}} e^{2i\pi \frac{xy}{2}} \left|y\right\rangle\!\left\langle x\right| = \frac{1}{\sqrt{2}} \left( |0\rangle\langle 0| + |0\rangle\langle 1| + |1\rangle\langle 0| - |1\rangle\langle 1| \right)$$

**Proposition 1.** *The interpretation* ⟦.⟧ *is a* PROP-functor*, meaning: i)* ⟦. ◦ .⟧ = ⟦.⟧ ◦ ⟦.⟧*, ii)* ⟦. ⊗ .⟧ = ⟦.⟧ ⊗ ⟦.⟧*, iii)* ⟦σn,m⟧ = σn,m*.*
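
The interpretation ⟦.⟧ can be computed by brute force, by enumerating the boolean values of the input and internal variables. The sketch below does this for SOP terms given as Python functions (an assumed, ad-hoc encoding rather than the paper's), and uses it to re-check Example 3 and the composite computed in Example 2.

```python
# Brute-force evaluation of [[.]] : SOP -> Qubit for terms given as
# (n, m, k, scalar, P, Q): P maps bit tuples (x, y) to a real phase exponent,
# Q maps them to the m output bits.  The encoding is an illustrative assumption.
import numpy as np
from itertools import product

def interp(n, m, k, s, P, Q):
    mat = np.zeros((2 ** m, 2 ** n), dtype=complex)
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=k):
            col = int(''.join(map(str, x)), 2) if n else 0
            row = int(''.join(map(str, Q(x, y))), 2) if m else 0
            mat[row, col] += s * np.exp(2j * np.pi * P(x, y))
    return mat

# H := |x> -> 1/sqrt(2) sum_y e^{2 i pi x y / 2} |y>        (Example 1)
H = (1, 1, 1, 1 / np.sqrt(2),
     lambda x, y: x[0] * y[0] / 2,
     lambda x, y: (y[0],))
assert np.allclose(interp(*H), np.array([[1, 1], [1, -1]]) / np.sqrt(2))

# CNot := |x1, x2> -> |x1, x1 (+) x2>
CNot = (2, 2, 0, 1, lambda x, y: 0, lambda x, y: (x[0], x[0] ^ x[1]))
# id (x) H as one SOP term: |x1, x2> -> 1/sqrt(2) sum_y e^{2 i pi x2 y / 2} |x1, y>
idH = (2, 2, 1, 1 / np.sqrt(2),
       lambda x, y: x[1] * y[0] / 2,
       lambda x, y: (x[0], y[0]))
# The composite computed symbolically in Example 2:
comp = (2, 2, 1, 1 / np.sqrt(2),
        lambda x, y: (x[0] + x[1] - 2 * x[0] * x[1]) * y[0] / 2,
        lambda x, y: (x[0], y[0]))

assert np.allclose(interp(*comp), interp(*idH) @ interp(*CNot))
print("Example 2 checked numerically")
```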

### **3.3 SOP as a** *†***-Compact PROP**

**Towards a Compact Structure.** It is tempting to try and adapt the compact structure of **Qubit** to **SOP**. To do so, we can first define $\eta_n := |\rangle \mapsto \sum_{\boldsymbol{y}\in V^n} |\boldsymbol{y}, \boldsymbol{y}\rangle$. However, we cannot as easily define $\epsilon_n$. To do so, we need to put the phase polynomial to use: $\epsilon_n := |\boldsymbol{x}_1, \boldsymbol{x}_2\rangle \mapsto \frac{1}{2^n} \sum_{\boldsymbol{y}\in V^n} e^{2i\pi \frac{\boldsymbol{x}_1\cdot\boldsymbol{y}+\boldsymbol{x}_2\cdot\boldsymbol{y}}{2}} |\rangle$.

One can easily check that ⟦εn⟧ = εn. We can also easily check that the axioms of †-compact PROPs where εn does not appear, such as σn,n ◦ ηn = ηn and (idn ⊗ σn,m ⊗ idm) ◦ (ηn ⊗ ηm) = ηn+m, are satisfied.

However, the equation (εn ⊗ idn) ◦ (idn ⊗ ηn) = idn = (idn ⊗ εn) ◦ (ηn ⊗ idn) is not satisfied, as:

$$(\epsilon_n \otimes id_n) \circ (id_n \otimes \eta_n) = |\boldsymbol{x}\rangle \mapsto \frac{1}{2^n} \sum_{\boldsymbol{y}_1, \boldsymbol{y}_2 \in V^n} e^{2i\pi \frac{\boldsymbol{x} \cdot \boldsymbol{y}_2 + \boldsymbol{y}_1 \cdot \boldsymbol{y}_2}{2}} |\boldsymbol{y}_1\rangle \neq id_n$$

The fact that we have (εn ⊗ idn) ◦ (idn ⊗ ηn) ≠ idn, while the equation does hold for its interpretation in **Qubit**, hints at a way to *rewrite* the first term as the second.

**An Equational Theory.** A rewrite strategy is given in [1], and we show in Figure 1 the rules we are going to use in the paper. Each rewrite rule contains a condition, which usually ensures that a variable (the one we want to get rid of) does not appear in some polynomials. We hence use Var as the operator that gets all the variables from a sequence of polynomials. For simplicity, the input signature is omitted, as well as the parameters in the polynomials.

$$\sum_{\boldsymbol{y}} e^{2i\pi P} \left| \boldsymbol{Q} \right\rangle \quad \xrightarrow[y_0 \notin \mathrm{Var}(P,\boldsymbol{Q})]{} \quad 2 \sum_{\boldsymbol{y}\setminus\{y_0\}} e^{2i\pi P} \left| \boldsymbol{Q} \right\rangle \tag{Elim}$$

$$\sum_{\boldsymbol{y}} e^{2i\pi \left(\frac{y_0}{2}\left(y_0'+\widehat{Q_2}\right)+R\right)} \left| \boldsymbol{Q} \right\rangle \quad \xrightarrow[\substack{y_0 \notin \mathrm{Var}(R,\,Q_2,\,\boldsymbol{Q}) \\ y_0' \notin \mathrm{Var}(Q_2)}]{} \quad 2 \sum_{\boldsymbol{y}\setminus\{y_0,y_0'\}} e^{2i\pi R\left[y_0' \leftarrow \widehat{Q_2}\right]} \left| \boldsymbol{Q}\left[y_0' \leftarrow Q_2\right] \right\rangle \tag{HH}$$

$$\sum_{\boldsymbol{y}} e^{2i\pi \left(\frac{y_0}{4} + \frac{y_0}{2}\widehat{Q_2} + R\right)} \left| \boldsymbol{Q} \right\rangle \quad \xrightarrow[y_0 \notin \mathrm{Var}(Q_2,\, R,\, \boldsymbol{Q})]{} \quad \sqrt{2} \sum_{\boldsymbol{y}\setminus\{y_0\}} e^{2i\pi \left(\frac{1}{8} - \frac{1}{4}\widehat{Q_2} + R\right)} \left| \boldsymbol{Q} \right\rangle \tag{$\omega$}$$

**Fig. 1.** Rewrite strategy −→Clif.

−→Clif denotes the rewrite system formed by the three rules (Elim), (HH) and (ω), and −→∗Clif is its transitive closure. Notice that all the rules remove at least one variable from the morphism, so we know that −→Clif terminates. When the rules are not oriented, we get an equivalence relation on the morphisms of **SOP**. We denote this equivalence ∼Clif.

We denote by **SOP**/∼Clif the category **SOP** quotiented by the equivalence relation ∼Clif.

It is to be noticed that:

**Proposition 2.** *For any rule* r *of* −→Clif *and* t1, t2 ∈ **SOP***:*

$$t_1 \xrightarrow{\;r\;} t_2 \implies \begin{cases} A \circ t_1 \circ B \xrightarrow{\;r\;} A \circ t_2 \circ B & \text{for all } A \text{ and } B \text{ composable} \\ A \otimes t_1 \otimes B \xrightarrow{\;r\;} A \otimes t_2 \otimes B & \text{for all } A \text{ and } B \end{cases}$$

*This obviously generalises to* ∼ Clif*.*

This result allows us to forget about the context in the rewriting process.

The newly obtained category **SOP**/∼Clif is still a PROP. It even has a compact structure, as the last necessary axiom is now derivable:

$$(\epsilon \otimes id) \circ (id \otimes \eta) = |x\rangle \mapsto \frac{1}{2} \sum_{y_1, y_2 \in V} e^{2i\pi\left(\frac{y_1 y_2}{2} + \frac{x y_2}{2}\right)} |y_1\rangle \;\xrightarrow[\text{(HH)}]{}\; |x\rangle \mapsto |x\rangle = id$$

and similarly for (id ⊗ ε) ◦ (η ⊗ id) = id.
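
The soundness of this (HH) step can also be double-checked numerically: evaluating the left-hand term under ⟦.⟧ (for n = 1) does give the identity matrix. A minimal self-contained check, in an ad-hoc encoding:

```python
# Numerical check that the term (eps (x) id) . (id (x) eta), i.e.
# |x> -> 1/2 sum_{y1,y2} e^{2 i pi (y1 y2 + x y2)/2} |y1>,
# denotes the identity on one qubit, as the (HH) step above asserts.
import numpy as np
from itertools import product

mat = np.zeros((2, 2), dtype=complex)
for x, y1, y2 in product((0, 1), repeat=3):
    mat[y1, x] += 0.5 * np.exp(2j * np.pi * (y1 * y2 + x * y2) / 2)
assert np.allclose(mat, np.eye(2))
print(mat.real)
```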

*†***-Functor for SOP.** To show that **SOP**/∼Clif is †-compact, we lack a notion of †-functor on **SOP**. Remember that we defined $\overline{(.)}$ as $(.)^{\dagger t}$. Since we have a compact structure, we can already define the functor $(.)^t$. Thanks to the new equivalence relation ∼Clif, this functor is involutive. Hence, we have $(.)^\dagger = \overline{(.)}^{\,t}$. An appropriate definition of the conjugation can be given:

**Definition 3.** *The conjugation is defined as:*

$$\overline{|x\rangle \mapsto s\_f \sum e^{2i\pi P\_f} |Q\_f\rangle} := |x\rangle \mapsto s\_f \sum e^{-2i\pi P\_f} |Q\_f\rangle$$

By combination with $(.)^t$, this gives a definition of $(.)^\dagger$. These three functors are the expected ones:

**Proposition 3.** $\left[\!\!\left[(.)^t\right]\!\!\right] = \left[\!\!\left[.\right]\!\!\right]^t$*,* $\left[\!\!\left[\,\overline{(.)}\,\right]\!\!\right] = \overline{\left[\!\!\left[.\right]\!\!\right]}$*,* $\left[\!\!\left[(.)^\dagger\right]\!\!\right] = \left[\!\!\left[.\right]\!\!\right]^\dagger$

We can finally prove the wanted result:

**Theorem 1. SOP**/ ∼ Clif *is a* †*-compact PROP.*

### **4 Redefinition of SOP**

In **Qubit**, and hence in **SOP**, because the structures are †-compact, it may feel unnatural to have an asymmetry between inputs and outputs of the process. Why not have morphisms of the form $f = s \sum_{\boldsymbol{y}} e^{2i\pi P} |\boldsymbol{O}\rangle\langle\boldsymbol{I}|$? In this case, we have to change the definition of the composition, which has for consequence that the **SOP** morphisms do not form a category. However, it is a category when quotiented by ∼Clif. This is the reason why we did not define **SOP** like this at first, although it greatly simplifies the notions of compact structure and †-functor.

We now redefine **SOP**, and will use this new definition in the rest of the paper:

**Definition 4 (SOP).** *We redefine* **SOP** *as the collection of objects* ℕ *and morphisms between them, of the form*

$$f : n \to m := s \sum_{\boldsymbol{y}\in V^k} e^{2i\pi P(\boldsymbol{y})} \left|\boldsymbol{O}(\boldsymbol{y})\right\rangle\!\left\langle\boldsymbol{I}(\boldsymbol{y})\right| \quad \text{with } s \in \mathbb{R},\ P \in \mathbb{R}[X_1,\dots,X_k]/(1,\, X_i^2 - X_i),\ \boldsymbol{O} \in (\mathbb{F}_2[X_1,\dots,X_k])^m,\ \boldsymbol{I} \in (\mathbb{F}_2[X_1,\dots,X_k])^n$$
As announced, this is not a category, as $id \circ id = \frac{1}{2} \sum_{\boldsymbol{y}} e^{2i\pi \frac{(y_1+y_2)y_3}{2}} |y_2\rangle\langle y_1| \neq \sum_{y} |y\rangle\langle y| = id$. This problem is solved by reintroducing the rewrite rules, adapted to the new formalism. In the following, references to the rewrite rules are to their adapted version.

The results given for the previous formalisation can easily be adapted. In particular:

**Proposition 4. SOP**/∼Clif *is a* †*-compact PROP, and* ⟦.⟧ *is a* †*-compact PROP-functor.*

*Remark 1.* When building an **SOP** morphism t from a circuit (or a diagram, as we will show in the following) in this formalism, provided the complexity of the gates is bounded (e.g. in the gate set {H, RZ(α), CNot}), the resulting t is always of size O(d × n), where n is the size of the register and d the *depth* of the circuit (and for a diagram in O(G × a), where G is the number of generators and a the maximum arity of these generators). This contrasts with the first definition of **SOP**, where the size of the constructed **SOP** term gets exponential in general.

#### **5 SOP and Graphical Languages**

The sum-over-paths formalism was initially intended to be used for isometries. As such, it was given a weak form of completeness – as we will discuss in the next section. However, if transforming a quantum circuit – that describes an isometry – into an **SOP** morphism is easy, the converse, transforming an **SOP** morphism into a circuit, is not. And actually, not all **SOP** morphisms represent an isometry. For instance, the morphism ε1 described above is not an isometry. An even smaller example is $\sum_y \langle y|$, which is a valid **SOP** morphism, but clearly does not represent an isometry.

Monoidal categories, and subsequently PROPs, have the benefit of having a nice graphical representation, using string diagrams. The fact that **SOP** is one hints at another (family) of language(s) more suited for representing it: the Z∗- Calculi: ZX, ZW and ZH [7,8,10,3]. These are all †-compact graphical languages, that have an interpretation in **Qubit**, and are universal for **Qubit**. This means that any morphism of **Qubit** can be represented as a morphism of either of these 3 languages.

The language that happens to be the closest to **SOP** is the ZH-Calculus. This is the one we are going to present in the following. However, bear in mind that, as we have semantics-preserving functors between any two of these three languages, one can do the same work with ZX and ZW-Calculi.

The link between the sum-over-paths formalism and the ZH-Calculus was first shown in [14,15]. We give here a slightly different but equivalent presentation, which in particular uses the fact that we altered the formalism of **SOP**, and we will focus this presentation on the Clifford fragment, as it is sufficient for the scope of the present article, although a more general presentation could be given (see the previous two references, or the longer version of the present article).

#### **5.1 The Clifford Fragment of the ZH-Calculus**

**ZH**Clif is a PROP whose morphisms are composed (sequentially (. ◦ .) or in parallel (. ⊗ .)) from the following generators: the Z-spiders (from n inputs to m outputs), the H-generator (one input, one output), the states $e^{i\alpha}$, and the scalars s; where $\alpha \in \frac{\pi}{2}\mathbb{Z}$ and $s \in \left\langle\sqrt{2}, e^{i\frac{\pi}{4}}\right\rangle$, the multiplicative group freely generated by $\sqrt{2}$ and $e^{i\frac{\pi}{4}}$.

**ZH**Clif is made a †-compact PROP, which means it also has the symmetric structure σn,m, the compact structure ηn and εn, and a †-functor $(.)^\dagger : \mathbf{ZH}_{\mathrm{Clif}}^{op} \to \mathbf{ZH}_{\mathrm{Clif}}$.

For convenience, we define two additional spiders as derived generators, given by ZH-diagrams built from the generators above.

The full language comes with a way of interpreting the morphisms as morphisms of **Qubit**, whose restriction to **ZH**Clif maps to **Stab**. The standard interpretation ⟦.⟧ : **ZH**Clif → **Stab** is a †-compact PROP-functor, defined on the generators as:

$$\left[\!\!\left[\,\text{Z-spider}_{n\to m}\,\right]\!\!\right] = |0^m\rangle\langle 0^n| + |1^m\rangle\langle 1^n|, \qquad \left[\!\!\left[\,\text{H}\,\right]\!\!\right] = \sum_{x,y\in\{0,1\}} (-1)^{xy}\, |y\rangle\langle x|,$$
$$\left[\!\!\left[\,e^{i\alpha}\,\right]\!\!\right] = |0\rangle + e^{i\alpha}|1\rangle, \qquad \left[\!\!\left[\,s\,\right]\!\!\right] = s$$

Notice that we used the same symbol for two different functors: the two interpretations ⟦.⟧ : **SOP** → **Qubit** and ⟦.⟧ : **ZH**Clif → **Stab**. It should be clear from the context which one is to be used.

The language is universal for **Stab**:

**Proposition 5.** ⟦.⟧ : **ZH**Clif → **Stab** *is onto, i.e.*

$$\forall f \in \mathbf{Stab},\ \exists D_f \in \mathbf{ZH}_{\mathrm{Clif}},\quad \left[\!\!\left[ D_f \right]\!\!\right] = f$$

Since it is not a 1-to-1 correspondence, the language comes with an equational theory, which in particular gives the axioms for a †-compact PROP. We will not present it here.

#### **5.2 From ZHClif to SOP**

We show in this section how any **ZH**Clif morphism can be turned into an **SOP** morphism in a way that preserves the semantics. We define [.]sop : **ZH**Clif → **SOP** as the †-compact PROP-functor such that:

$$\left[\begin{array}{c}\raisebox{1.0pt}{\left[\begin{array}{c}\cdots\\\left\uparrow\end{array}\right.}\\\end{array}\right]^{\mathrm{sop}}:=\sum\_{y}\left|y,\ldots,y\right\rangle\langle y,\ldots,y|\qquad\left[\begin{array}{c}\left[\begin{array}{c}\vdots\\\hline\end{array}\right]^{\mathrm{sop}}:=\sum\_{y\_{0},y\_{1}}e^{2i\pi\frac{y\_{0}y\_{1}}{2}}\ |y\_{0}\rangle\langle y\_{1}|\\\end{array}\right.$$

$$\left[\,e^{i\alpha}\,\right]^{\mathrm{sop}} := \sum_{y} e^{2i\pi\frac{\alpha}{2\pi} y}\ |y\rangle \qquad \left[\,s\,\right]^{\mathrm{sop}} := \rho\sum_{\emptyset} e^{2i\pi\frac{\theta}{2\pi}}\ |\rangle\langle| \quad \text{for } s = \rho e^{i\theta}\in\left\langle\sqrt{2}, e^{i\frac{\pi}{4}}\right\rangle$$

This interpretation can be extended to the full graphical language. It preserves the semantics:

**Proposition 6.** $\llbracket\, [.]^{\mathrm{sop}}\,\rrbracket = \llbracket.\rrbracket$*.*
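As a concrete instance of Proposition 6 (our sanity check, assuming the translation of the H-box given above), one can evaluate the sum-over-paths term $\sum_{y_0,y_1} e^{2i\pi\frac{y_0y_1}{2}}|y_0\rangle\langle y_1|$ as a matrix and compare it with the standard interpretation of the H-box:

```python
import numpy as np
from itertools import product

def sop_matrix(phase_poly, out_polys, in_polys, num_vars, scalar=1.0):
    """Evaluate a sum-over-paths term  s * sum_y e^{2 i pi P(y)} |O(y)><I(y)|  as a matrix."""
    M = np.zeros((2 ** len(out_polys), 2 ** len(in_polys)), dtype=complex)
    for y in product((0, 1), repeat=num_vars):
        row = int("".join(str(p(y)) for p in out_polys), 2) if out_polys else 0
        col = int("".join(str(p(y)) for p in in_polys), 2) if in_polys else 0
        M[row, col] += scalar * np.exp(2j * np.pi * phase_poly(y))
    return M

# [H-box]^sop = sum_{y0,y1} e^{2 i pi * y0*y1/2} |y0><y1|
h_sop = sop_matrix(phase_poly=lambda y: y[0] * y[1] / 2,
                   out_polys=[lambda y: y[0]],
                   in_polys=[lambda y: y[1]],
                   num_vars=2)

# Standard interpretation of the H-box generator: sum_{x,y} (-1)^{xy} |y><x|
h_zh = np.array([[(-1.0) ** (x * y) for x in (0, 1)] for y in (0, 1)])
assert np.allclose(h_sop, h_zh)   # Proposition 6, on this generator
```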

#### **5.3 The Clifford Fragment of SOP**

Since **ZH**Clif is universal for **Stab**, the Clifford fragment of **Qubit**, and since we have an interpretation $[.]^{\mathrm{sop}} : \mathbf{ZH}_{\mathrm{Clif}} \to \mathbf{SOP}$ that preserves the semantics, we can define **SOP**Clif as the image of **ZH**Clif by $[.]^{\mathrm{sop}}$. This gives a characterisation of the fragment:

**Definition 5. SOP**Clif *is the subPROP of* **SOP** *with the same objects, and whose morphisms are of the form* $\frac{1}{\sqrt{2}^p}\sum e^{2i\pi\left(\frac{1}{8}P^{(0)} + \frac{1}{4}P^{(1)} + \frac{1}{2}P^{(2)}\right)} |O\rangle\langle I|$ *where* $P^{(i)}$ *is a polynomial with integer coefficients of degree at most* $i$ *(hence* $P^{(0)}$ *is in fact merely an integer); and where all the* $O_i$ *and* $I_i$ *are linear.*
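For instance (our illustration, easily checked against the standard interpretation), the usual Clifford generators fit exactly this shape:

$$S = \sum_{y} e^{2i\pi\frac{y}{4}}\,|y\rangle\langle y|,\qquad CZ = \sum_{y_1,y_2} e^{2i\pi\frac{y_1y_2}{2}}\,|y_1,y_2\rangle\langle y_1,y_2|,\qquad H = \frac{1}{\sqrt{2}}\sum_{y_0,y_1} e^{2i\pi\frac{y_0y_1}{2}}\,|y_1\rangle\langle y_0|,$$

with $P^{(1)} = y$ for $S$, and $P^{(2)} = y_1y_2$ (resp. $y_0y_1$) for $CZ$ (resp. $H$); all ket and bra polynomials are single variables, hence linear.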

It is an easy check that $[\mathbf{ZH}_{\mathrm{Clif}}]^{\mathrm{sop}} \subseteq \mathbf{SOP}_{\mathrm{Clif}}$, so **SOP**Clif has enough morphisms to describe the Clifford fragment of quantum computing. We can even show it exactly captures it. To do so, we introduce an interpretation from **SOP**Clif back to **ZH**Clif.

#### **5.4 From SOPClif to ZHClif**

We define $[.]^{\mathrm{ZH}} : \mathbf{SOP}_{\mathrm{Clif}} \to \mathbf{ZH}_{\mathrm{Clif}}$ on arbitrary **SOP**Clif morphisms as a diagram (not reproduced here) in which a row of Z-spiders represents the variables $y_1,\ldots,y_k$.

The inputs of $O_i$ are linked to $y_1,\ldots,y_k$. The nodes $O_i$ can be defined inductively.

Notice that we did not define how to interpret a product $Q_1Q_2$. This can be done for the interpretation of the full **SOP** category, but it is unnecessary for **SOP**Clif where the $O_i$ are linear. The nodes $I_i$ are defined similarly, but upside-down. The node $P$ can also be defined inductively.

The obtained diagram can then be reduced using usual rules of **ZH**.

The system of interpretations is close to preserving the structure of the terms:

**Proposition 7.** $\left[\,[.]^{\mathrm{ZH}}\,\right]^{\mathrm{sop}} \sim_{\mathrm{Clif}} (.)$

**Corollary 1.** $\llbracket\, [.]^{\mathrm{ZH}}\,\rrbracket = \llbracket.\rrbracket$*.*

This result allows us to prove **SOP**Clif does capture the Clifford fragment of quantum mechanics:

**Proposition 8.** $\llbracket.\rrbracket : \mathbf{SOP}_{\mathrm{Clif}} \to \mathbf{Stab}$*, the restriction of the standard interpretation to* **SOP**Clif*, is onto* **Stab***.*

#### **6 A Complete Rewrite System for Clifford**

In [1], where the rewrite rules are introduced, the author gives a notion of completeness for Clifford *unitaries*, that we will refer to in the following as "weak completeness":

**Proposition 9 (Weak Completeness for Clifford Unitaries).** *Given two terms* $t_1$*,* $t_2$ *of* **SOP**Clif *such that* $t_i \circ t_i^\dagger = \mathrm{id} = t_i^\dagger \circ t_i$*, we have:*

$$t_1 \circ t_2^\dagger \xrightarrow[\mathrm{Clif}]{\ast} \mathrm{id} \quad\Longleftrightarrow\quad \llbracket t_1\rrbracket = \llbracket t_2\rrbracket$$

In practice, this is sufficient for deciding the equivalence of two Clifford quantum circuits, as they are represented as unitary morphisms of **SOP**Clif. However, in our case, where we deal with more than unitaries, we cannot use this trick. Instead, we aim at a result of the form "$t_1 \xrightarrow{\ast} t \xleftarrow{\ast} t_2 \iff \llbracket t_1\rrbracket = \llbracket t_2\rrbracket$". In other words, we want a rewrite system that will transform any term of **SOP**Clif into a unique normal form. However, the rewrite system $\xrightarrow[\mathrm{Clif}]{}$ is not enough for this:

**Lemma 1.** $\xrightarrow[\mathrm{Clif}]{}$ *is not confluent in* **SOP**Clif*.*

To address this problem, we propose to add three rewrite rules to the previously presented ones. These new rewrite rules are shown in Figure 2.

$$\sum e^{2i\pi(P)}|O_1,\ldots,\underbrace{y_0 \oplus O_i'}_{O_i},\ldots,O_m\rangle\langle I| \;\longrightarrow\; \sum e^{2i\pi\left(P\left[y_0 \leftarrow \widehat{O_i}\right]\right)}\left(|O\rangle\langle I|\right)\left[y_0 \leftarrow O_i\right] \quad \text{(ket)}$$

$$\sum e^{2i\pi(P)} |O\rangle\langle I_1,\ldots,\underbrace{y_0 \oplus I_i'}_{I_i},\ldots,I_m| \;\longrightarrow\; \sum e^{2i\pi\left(P\left[y_0 \leftarrow \widehat{I_i}\right]\right)}\left(|O\rangle\langle I|\right)\left[y_0 \leftarrow I_i\right] \quad \text{(bra)}$$

$$s\sum_{y} e^{2i\pi\left(\frac{y_0}{2}+R\right)}|O\rangle\langle I| \;\xrightarrow[\left(R\neq 0 \text{ or } (O,I)\neq 0\right)\, \wedge\, y_0\notin \mathrm{Var}(R,O,I)]{}\; \sum_{y_0} e^{2i\pi\left(\frac{y_0}{2}\right)}|0,\ldots,0\rangle\langle 0,\ldots,0| \quad \text{(Z)}$$

**Fig. 2.** Together with those of $\xrightarrow[\mathrm{Clif}]{}$, these rules constitute the rewrite system $\xrightarrow[\mathrm{Clif+}]{}$.

The last rule (Z) describes what happens for a term that represents the linear map 0. Rule (bra) is simply the continuation of (ket). Both perform suitable changes of variables.
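As an illustration of rule (Z) (our example, using the reconstruction of the rule in Figure 2): the term below represents the zero map, since the variable $y_0$ only occurs in the phase and contributes a factor $\sum_{y_0}(-1)^{y_0} = 0$; the rule sends it to the canonical null form of Proposition 11.

$$\sum_{y_0,y_1} e^{2i\pi\frac{y_0}{2}}\,|y_1\rangle\langle y_1| \;\xrightarrow[\mathrm{Clif+}]{(Z)}\; \sum_{y_0} e^{2i\pi\frac{y_0}{2}}\,|0\rangle\langle 0|$$

Here $R = 0$, the ket and bra polynomials are not all zero, and $y_0\notin\mathrm{Var}(R,O,I)$, so the side condition is satisfied; both sides interpret to the zero map.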

**Proposition 10.** *The rewrite system* $\xrightarrow[\mathrm{Clif+}]{}$ *terminates.*

Not only does this rewrite system terminate, it is confluent in **SOP**Clif, and the induced equivalence relation $\sim_{\mathrm{Clif+}}$ is complete for Clifford. The plan is to prove this by showing that any morphism of **SOP**Clif reduces to a normal form that is unique, up to α-conversion (upcoming Thm. 2). To get there, we first need a few intermediary results.

**Lemma 2.** *Any morphism of* **SOP**Clif *reduces by* $\xrightarrow[\mathrm{Clif+}]{}$ *to a morphism of the form* $\frac{1}{\sqrt{2}^p}\sum e^{2i\pi P} |O\rangle\langle I|$ *where:*
- $\mathrm{Var}(P) \subseteq \mathrm{Var}(O,I)$, *or* $P = \frac{y_0}{2}$ *where* $y_0 \notin \mathrm{Var}(O,I)$;
- $O_i$ *is either some* $y_k$*, or of the form* $c \oplus \bigoplus_{y\in\mathrm{Var}(O_1,\ldots,O_{i-1})} c_y\, y$ *where* $c, c_y \in \{0,1\}$;
- $I_i$ *is either some* $y_k$*, or of the form* $c \oplus \bigoplus_{y\in\mathrm{Var}(O,I_1,\ldots,I_{i-1})} c_y\, y$ *where* $c, c_y \in \{0,1\}$.

To start with, we deal with the case where the term represents the null map.

**Proposition 11.** *Let* $t$ *be a morphism of* **SOP**Clif *such that* $\llbracket t\rrbracket = 0$*. Then:*

$$t \xrightarrow[\mathrm{Clif+}]{\ast} \sum_{y_0} e^{2i\pi\frac{y_0}{2}} |0,\ldots,0\rangle\langle 0,\ldots,0|$$

**Corollary 2.** *If a morphism* $t = \frac{1}{\sqrt{2}^p}\sum e^{2i\pi P} |O\rangle\langle I|$ *of* **SOP**Clif *is irreducible and such that* $\mathrm{Var}(P) \subseteq \mathrm{Var}(O,I)$*, then* $\llbracket t\rrbracket \neq 0$*.*

Before moving on to the completeness by normal forms theorem, we need a result for the uniqueness of the phase polynomial:

**Lemma 3.** *Let* $P_1$ *and* $P_2$ *be two polynomials of* $\mathbb{R}[X_1,\ldots,X_k]/(1,\ X_i^2 - X_i)$*. We have* $\left(\forall x\in\{0,1\}^k,\ P_1(x) = P_2(x)\right) \implies P_1 = P_2$*.*
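The mechanism behind Lemma 3 can be checked computationally (a sketch under our own encoding, not the paper's proof): modulo $X_i^2 - X_i$ every polynomial has a multilinear representative, whose coefficients are recovered from the values on $\{0,1\}^k$ by Möbius inversion, so equal value tables force equal representatives.

```python
from itertools import combinations

def multilinear_coeffs(f, k):
    """Recover the coefficients c_S of the multilinear polynomial
    sum_S c_S * prod_{i in S} X_i from its values f on {0,1}^k (Moebius inversion)."""
    coeffs = {}
    for r in range(k + 1):
        for S in combinations(range(k), r):
            point = lambda T: tuple(1 if i in T else 0 for i in range(k))
            coeffs[S] = sum((-1) ** (len(S) - len(T)) * f(point(T))
                            for r2 in range(len(S) + 1) for T in combinations(S, r2))
    return coeffs

# Two syntactically different expressions that agree on {0,1}^2 ...
p1 = lambda x: x[0] ** 3 + x[1]       # X0^3 + X1
p2 = lambda x: x[0] + x[1]            # X0   + X1   (same values, since X^3 = X on {0,1})
assert multilinear_coeffs(p1, 2) == multilinear_coeffs(p2, 2)
# ... have the same multilinear representative, i.e. they are equal modulo X^2 - X.
```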

**Theorem 2.** *Let* $t_1$ *and* $t_2$ *be two morphisms of* **SOP**Clif *such that* $\llbracket t_1\rrbracket = \llbracket t_2\rrbracket$*. Then, there exists* $t$ *in* **SOP**Clif *such that* $t_1 \xrightarrow[\mathrm{Clif+}]{\ast} t \xleftarrow[\mathrm{Clif+}]{\ast} t_2$*, up to* α*-conversion.*

This result is not totally surprising since, as shown in [15], the rules of $\xrightarrow[\mathrm{Clif}]{}$ are generalisations of the so-called pivoting and local complementation, which can be used to reduce any Clifford ZX (or ZH) diagram into a *pseudo*-normal form [9,2]; there, a diagram can have several different but equivalent pseudo-normal forms. The rules introduced to get $\xrightarrow[\mathrm{Clif+}]{}$ are simply here to further rewrite terms in pseudo-normal form into terms in proper (unique) normal form.

**Corollary 3.** *The equality of morphisms in* $\mathbf{SOP}_{\mathrm{Clif}}/\!\sim_{\mathrm{Clif+}}$ *is decidable in time polynomial in the size of the phase polynomial and in the combined size of the ket/bra polynomials.*

Although the set of rules is confluent in **SOP**Clif, it is not in **SOP**:

**Lemma 4 (Non-confluence).** *The rewrite systems* $\xrightarrow[\mathrm{Clif}]{}$ *and* $\xrightarrow[\mathrm{Clif+}]{}$ *are not confluent in* **SOP***.*

#### **7 SOP with Discards**

In this section, we extend **SOP** so that it can express the larger formalism of mixed quantum operators. The discard construction can be used for that purpose, as well as for extending the rewrite system for the Clifford fragment. We finally leverage the previous completeness theorem to get a similar result in this extension.

#### **7.1 The Discard Construction on SOP**

In [5], a construction is given to extend any †-compact PROP for *pure* quantum mechanics to another †-compact PROP for quantum mechanics with environment. This new formalism can also be understood as the previous one, but where on top of it, one can discard the qubits. Because **SOP** fits the requirements, the construction can be applied to it.

First, we have to create the subcategory **SOP**iso of **SOP** that contains all its isometries. The objects of the new category are the same, and its morphisms are {f ∈ **SOP** | f † ◦ f = id}.

These are important, as the isometries are exactly the pure quantum operators that can be discarded. The next step in the construction does just that. We perform the affine completion of **SOP**iso, that is, for every object $n$, we add a new morphism $!_n : n \to 0$, and we impose that $!\circ f = \,!$ for any $f$ in the new category, which we denote $\mathbf{SOP}^!_{\mathrm{iso}}$. We also need to impose that $!_n \otimes !_m = \,!_{n+m}$ and $!_0 = \mathrm{id}_0$.
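Semantically, membership in **SOP**iso simply means that the interpretation $f$ satisfies $f^\dagger\circ f = \mathrm{id}$; a small numpy check (ours) on familiar pure maps:

```python
import numpy as np

def is_isometry(f, tol=1e-9):
    """f is an isometry iff f^dagger . f is the identity on its domain."""
    return np.allclose(f.conj().T @ f, np.eye(f.shape[1]), atol=tol)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # Hadamard: unitary, hence an isometry
prep0 = np.array([[1.0], [0.0]])                  # |0> preparation: isometry from C^1 to C^2
proj0 = np.array([[1.0, 0.0], [0.0, 0.0]])        # projector onto |0>: NOT an isometry

assert is_isometry(H) and is_isometry(prep0) and not is_isometry(proj0)
```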

Finally, the extended category (**SOP** with discards) is obtained as the pushout, in the category of SMCs, of the two inclusion functors $\mathbf{SOP}_{\mathrm{iso}} \hookrightarrow \mathbf{SOP}$ and $\mathbf{SOP}_{\mathrm{iso}} \hookrightarrow \mathbf{SOP}^!_{\mathrm{iso}}$.

We write the new morphisms in the form $s \sum_{y\in V^k} e^{2i\pi P(y)}\ |O(y)\rangle\, !D(y)\, \langle I(y)|$

where the additional *D* is a set of multivariate polynomials of F2. The fact that it is a set, and not a list, already captures some rules on the discard: first permuting qubits and then discarding them is equivalent to discarding them right away. Similarly, copying data and discarding the copies is equivalent to discarding the data right away.

Pure morphisms are those such that $D = \{\}$. In those, no qubits are discarded. We hence easily embed usual morphisms such as $H$ and $CZ$ into the new formalism.

The new morphisms $!_n$ are given by: $!_n := \sum_{y\in V^n} |\rangle\, !\{y_1,\ldots,y_n\}\, \langle y_1,\ldots,y_n|$

In the new formalism, the compositions are obtained exactly as previously, where the resulting set of discarded polynomials is the union of the other two.

It might be useful to be able to give an interpretation to the morphisms of the new formalism. To do so, we use the CPM construction [17] to map morphisms of **SOP** with discards to morphisms of **SOP**.

**Definition 6.** *The map* CPM *from* **SOP** *with discards to* **SOP** *is defined as:*

$$s\sum_{y} e^{2i\pi P} |O\rangle\, !D\, \langle I| \;\mapsto\; \frac{s^2}{2^{|D|}} \sum_{y_1,y_2,y} e^{2i\pi\left(P(y_1) - P(y_2) + \frac{D(y_1)\cdot y + D(y_2)\cdot y}{2}\right)} |O(y_1), O(y_2)\rangle\langle I(y_1), I(y_2)|$$

We can now define a standard interpretation of the morphisms of **SOP** with discards as:

**Definition 7.** *The standard interpretation* $\llbracket.\rrbracket$ *of* **SOP** *with discards is defined as* $\llbracket.\rrbracket := \llbracket\mathrm{CPM}(.)\rrbracket$*.*
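At the level of density matrices, the doubling performed by CPM implements the usual "discard = partial trace" semantics. The following numpy sketch (ours, purely illustrative) checks this on a Bell pair whose second qubit is discarded:

```python
import numpy as np

def partial_trace_last(rho, dims):
    """Trace out the last subsystem of a density matrix rho on a system of dimensions `dims`."""
    d_rest, d_last = int(np.prod(dims[:-1])), dims[-1]
    rho = rho.reshape(d_rest, d_last, d_rest, d_last)
    return np.einsum('ikjk->ij', rho)

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)          # (|00> + |11>) / sqrt(2)
rho = np.outer(bell, bell.conj())                   # pure Bell density matrix
reduced = partial_trace_last(rho, (2, 2))           # discard the second qubit

assert np.allclose(reduced, np.eye(2) / 2)          # maximally mixed state, as expected
```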

Again, it is easy to transform any morphism of **SOP** with discards into **ZH** and vice versa:

*(The translation $[\,.\,]^{\mathrm{ZH}}$ of a morphism $s\sum_{y\in V^k} e^{2i\pi P(y)} |O(y)\rangle\, !D(y)\, \langle I(y)|$ is given by a diagram extending the pure-case translation; it is not reproduced here.)*

and the discard generator of **ZH** is interpreted as $!_1$ by $[.]^{\mathrm{sop}}$.

#### **7.2 SOP with Discards for Clifford**

The discard construction can be applied to the subcategory **SOP**Clif. We end up with a new category, **SOP**Clif with discards, such that the evident square of inclusion functors into **SOP** and **SOP** with discards commutes.

Following the characterisation of **SOP**Clif morphisms, we determine that all the morphisms of **SOP**Clif with discards are of the form $\frac{1}{\sqrt{2}^p}\sum e^{2i\pi\left(\frac{1}{8}P^{(0)} + \frac{1}{4}P^{(1)} + \frac{1}{2}P^{(2)}\right)} |O\rangle\, !D\, \langle I|$ where $p\in\mathbb{Z}$, where $P^{(i)}$ is a polynomial with integer coefficients and of degree at most $i$, and where the polynomials of $O$, $D$ and $I$ are linear.

The rewrite system presented previously can obviously be adapted to the new formalism (when there is a substitution, it has to be applied in $!D$ as well). On top of that, the condition that makes the object $0$ terminal in $\mathbf{SOP}^!_{\mathrm{iso}}$ can be translated as a meta rule, which sadly is not easy to apply. Thankfully, the last part of [5] is devoted to showing that this big meta rule can sometimes be replaced by a few small ones. The idea is that, in some cases (in particular in the Clifford fragment), all the isometries can be generated from a finite set of generators. In particular, it is enough to impose the following equations:

$$1 \cdot e^{i\alpha} = 1 \qquad !_1 \circ |0\rangle = 1 \qquad !_1 \circ H = \,!_1 \qquad !_1 \circ S = \,!_1 \qquad !_2 \circ CZ = \,!_2$$
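Each of these equations says that applying the corresponding isometry and then discarding is the same as discarding directly; on density matrices this amounts to trace preservation, which can be checked numerically (our sketch):

```python
import numpy as np

H  = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
S  = np.diag([1, 1j])
CZ = np.diag([1, 1, 1, -1])
ket0 = np.array([[1.0], [0.0]])

rng = np.random.default_rng(0)
rho1 = rng.random((2, 2)) + 1j * rng.random((2, 2)); rho1 = rho1 @ rho1.conj().T   # arbitrary PSD
rho2 = rng.random((4, 4)) + 1j * rng.random((4, 4)); rho2 = rho2 @ rho2.conj().T

assert np.isclose(np.trace(H @ rho1 @ H.conj().T), np.trace(rho1))     # !_1 . H  = !_1
assert np.isclose(np.trace(S @ rho1 @ S.conj().T), np.trace(rho1))     # !_1 . S  = !_1
assert np.isclose(np.trace(CZ @ rho2 @ CZ.conj().T), np.trace(rho2))   # !_2 . CZ = !_2
assert np.isclose(np.trace(ket0 @ ket0.conj().T), 1.0)                 # !_1 . |0> = 1
```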

Based on this, we can give an updated set of rewrite rules fit for the introduction of the discard. Due to the size of this rewrite system, we do not provide it here, but it can be found in the extended version of this paper. The rewrite system is denoted $\xrightarrow[\mathrm{Clif}_!]{}$ and induces an equivalence relation $\sim_{\mathrm{Clif}_!}$. Notice that we can extend CPM to $\mathrm{CPM} : (\mathbf{SOP}\text{ with discards})/\!\sim_{\mathrm{Clif}_!} \;\to\; \mathbf{SOP}/\!\sim_{\mathrm{Clif+}}$, which makes it a functor.

**Proposition 12.** *The rewrite system* $\xrightarrow[\mathrm{Clif}_!]{}$ *terminates.*

We aim to prove a result similar to that of the discard-free Clifford fragment, namely that the new rewrite system rewrites any morphism of the Clifford fragment into a unique normal form. The idea here is to make use of the previous result.

**Lemma 5.** *Any non-null morphism of* **SOP**Clif *with discards can be reduced to:*

$$\frac{1}{\sqrt{2^p}} \sum\_{y, y\_d} e^{2i\pi \left(\frac{1}{4}P^{(1)}(y) + \frac{1}{2}P^{(2)}(y, y\_d)\right)} |O(y, y\_d)\rangle ! \{y\_d\} \left\langle I(y, y\_d) \right| \text{ where:} $$


$$-\mathop{\mathrm{Var}}^{\mathcal{U}}\_{\mathrm{Var}}^{\nu}(\overline{P}^{(1)}, P^{\langle 2 \rangle}) \subseteq \mathrm{Var}(\mathcal{O}, I, D) \text{ or } P = \frac{y\_0}{2} \text{ with } y\_0 \notin \mathrm{Var}(\mathcal{O}, I, D).$$

**Corollary 4.** *Any morphism of* **SOP**Clif *eventually reduces to a morphism of the form given in Lem. 5.*

**Lemma 6.** *Any morphism* $t$ *of* **SOP**Clif *with discards such that* $\llbracket t\rrbracket = 0$ *reduces to:*

$$\sum_{y_0} e^{2i\pi\left(\frac{y_0}{2}\right)} |0,\ldots,0\rangle\, !\{\}\, \langle 0,\ldots,0|$$

**Corollary 5.** *If* $t$ *in* **SOP**Clif *with discards is terminal with* $\mathrm{Var}(P)\subseteq\mathrm{Var}(O,D,I)$*, then* $\llbracket t\rrbracket\neq 0$*.*

**Definition 8.** *We define* **SOP**Clif *as the set of morphisms of* **SOP**Clif *with discards that are in the form given in Lem. 5. We define the function* F *on it such that, for any morphism* $t = \frac{1}{\sqrt{2}^p} \sum_{y,y_d} e^{2i\pi P(y,y_d)}\ |O(y,y_d)\rangle\, !\{y_d\}\, \langle I(y,y_d)|$*:*

$$F(t) := \frac{1}{\sqrt{2}^{2p}} \sum_{y,y',y_d} e^{2i\pi\left(P(y,y_d) - P(y',y_d)\right)}\ |O(y,y_d), O(y',y_d)\rangle\langle I(y,y_d), I(y',y_d)|$$

This new functor F can be seen as a simplified CPM construction, applicable only for terms that are already simplified (in the form of Lem. 5).

**Proposition 13.** *For any* $t$ *in* **SOP**Clif *as in Def. 8,* $F(t) \sim_{\mathrm{Clif+}} \mathrm{CPM}(t)$*. This implies* $\llbracket F(.)\rrbracket = \llbracket\mathrm{CPM}(.)\rrbracket$*.*

**Definition 9.** *We define a function* G *on some morphisms of* **SOP**Clif *that have an appropriate form. Let* $t = \frac{1}{\sqrt{2}^{2p}} \sum_{y} e^{2i\pi P}\ |O_1,O_2\rangle\langle I_1,I_2|$ *with* $|O_1| = |O_2|$ *and* $|I_1| = |I_2|$*. Let us partition* $y$ *into:* $\{y_d\} := \{y\}\setminus\mathrm{Var}(O_1\oplus O_2, I_1\oplus I_2)$*,* $\{y_1\} := \mathrm{Var}(O_1,I_1)\setminus\{y_d\}$ *and* $\{y_2\} := (\{y\}\setminus\{y_1\})\setminus\{y_d\}$*. If* $|y_1| = |y_2|$ *and if there exists a unique bijection* $\delta : \{y_2\}\to\{y_1\}$ *such that* $(O_1\oplus O_2, I_1\oplus I_2)[y_2\leftarrow\delta(y_2)] = \mathbf{0}$*, then* $G(t)$ *is defined, and:*

$$G(t) := \frac{1}{\sqrt{2}^p} \sum_{y_1,y_d} e^{-2i\pi P[y_1\leftarrow\mathbf{0}][y_2\leftarrow\delta(y_2)]} \Big( |O_2\rangle\, !\{y_d\}\, \langle I_2| \Big)[y_1\leftarrow\mathbf{0}][y_2\leftarrow\delta(y_2)]$$

The function G is designed to be an inverse of F for morphisms where it is defined, while at the same time being impervious to some rewrite rules.

**Proposition 14.** *Let* $t$ *be terminal with respect to* $\xrightarrow[\mathrm{Clif}_!]{}$*, and* $t'$ *such that* $F(t) \xrightarrow[\mathrm{Clif+}]{\ast} t'$*. Then,* $G(F(t))$ *and* $G(t')$ *exist, and* $G(F(t)) = G(t')$*.*

**Theorem 3.** *Let* $t_1$ *and* $t_2$ *be two morphisms of* **SOP**Clif *with discards such that* $\llbracket t_1\rrbracket = \llbracket t_2\rrbracket$*. If* $t'_1$ *and* $t'_2$ *are terminal such that* $t_1 \xrightarrow[\mathrm{Clif}_!]{\ast} t'_1$ *and* $t_2 \xrightarrow[\mathrm{Clif}_!]{\ast} t'_2$*, then* $t'_1 = t'_2$ *up to* α*-conversion.*

*Remark 2.* Interestingly, the previous proposition and theorem show that the simplification of a term of **SOP**Clif with discards can be carried out in the "pure" setting, after which G can be used to retrieve the normal form.

**Corollary 6.** *The equality of morphisms in* $(\mathbf{SOP}_{\mathrm{Clif}}\text{ with discards})/\!\sim_{\mathrm{Clif}_!}$ *is decidable in time polynomial in the size of the phase polynomial and in the combined size of the ket/bra/discarded polynomials.*

### **References**


2007). https://doi.org/10.1016/j.entcs.2006.12.018



### **A Quantified Coalgebraic van Benthem Theorem**

Paul Wild and Lutz Schröder

Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany {paul.wild,lutz.schroeder}@fau.de

**Abstract.** The classical van Benthem theorem characterizes modal logic as the bisimulation-invariant fragment of first-order logic; put differently, modal logic is as expressive as full first-order logic on bisimulationinvariant properties. This result has recently been extended to two flavours of quantitative modal logic, viz. fuzzy modal logic and probabilistic modal logic. In both cases, the quantitative van Benthem theorem states that every formula in the respective quantitative variant of first-order logic that is bisimulation-invariant, in the sense of being nonexpansive w.r.t. behavioural distance, can be approximated by quantitative modal formulae of bounded rank. In the present paper, we unify and generalize these results in three directions: We lift them to full coalgebraic generality, thus covering a wide range of system types including, besides fuzzy and probabilistic transition systems as in the existing examples, e.g. also metric transition systems; and we generalize from real-valued to quantale-valued behavioural distances, e.g. nondeterministic behavioural distances on metric transition systems; and we remove the symmetry assumption on behavioural distances, thus covering also quantitative notions of simulation.

**Keywords:** Modal logic · Quantale · Fuzzy logic · Coalgebra · Behavioural distance · Modal characterization.

### **1 Introduction**

Modal logic takes part of its popularity from the fact that it specifies transition systems at what for many purposes may be regarded as the right level of granularity; that is, it is invariant under the standard process-theoretic notion of *bisimulation* in the sense that bisimilar states satisfy the same modal formulae. There are two quite different well-known converses to this elementary property, which both witness the *expressiveness* of modal logic: By the *Hennessy-Milner theorem* [29], states in finitely branching systems that satisfy the same modal formulae are bisimilar, and by the *van Benthem theorem*, every first-order definable bisimulation-invariant property is expressible by a modal formula. Since modal logic embeds into first-order logic, the latter result may be phrased as saying that modal logic is the bisimulation-invariant fragment of first-order logic.

 Work of both authors forms part of the DFG project Probabilistic description logics as a fragment of probabilistic first-order logic (SCHR 1118/6-2)

© The Author(s) 2021

S. Kiefer and C. Tasson (Eds.): FOSSACS 2021, LNCS 12650, pp. 551–571, 2021. https://doi.org/10.1007/978-3-030-71995-1 28

In the two-valued setting, there has been increased recent interest in variants and generalizations of this result (e.g. [54,14,52,22,55,1]).

For quantitative systems, it has long been realized (e.g. [26,15,10]) that quantitative notions of process equivalence, generally referred to as *behavioural metrics* (although they are in general only *pseudo*metrics, as distinct but equivalent states have distance zero), are often more appropriate than two-valued bisimilarity. In particular, while two-valued notions of process equivalence just flag small deviations between systems as inequivalence, behavioural metrics can provide more fine-grained information on the degree of similarity of systems. Behavioural metrics are correspondingly used, e.g., in verification [25], differential privacy [13], and conformance testing of hybrid systems [36].

In the same way that two-valued modal logic constitutes a natural specification language for two-valued transition systems, quantitative systems correlate to quantitative modal logics. In this context, bisimulation invariance is read as *nonexpansiveness* w.r.t. behavioural distance, i.e. two states differ on a modal formula at most by their behavioural distance; we refer to this property as *behavioural nonexpansiveness*. Notably, van Breugel and Worrell [10] prove a Hennessy-Milner type theorem for a quantitative probabilistic modal logic: They show that on compact state spaces, the formulae of the logic lie dense in the space of behaviourally nonexpansive state properties, which implies that behavioural distance and logical distance coincide.

In the present paper, we are mainly interested in the other converse to behavioural nonexpansiveness, i.e. in *quantitative van Benthem theorems*. In previous work with Pattinson and König, we have established such theorems for quantitative modal logics of fuzzy [57] and probabilistic [58] transition systems. In the quantitative setting, these theorems take the form of approximability properties, and state that every behaviourally nonexpansive quantitative first-order property is approximable by quantitative modal formulae *of bounded rank*. The latter qualification is in fact the key content of the respective theorems – without it, approximability is closer in flavour to Hennessy-Milner-type theorems, which apply to arbitrary rather than just first-order definable properties (although one should note additionally that our van Benthem theorems do not assume compactness of the state space).

Our present contribution is to unify and generalize these results in three directions: First, we allow for full *coalgebraic generality*, i.e. we cover system types subsumed under the paradigm of *universal coalgebra* [49]. Besides the fuzzy and probabilistic systems featuring in the previous concrete instances of our result, this includes a wide range of weighted, game-based, and preferential systems; for illustration, we concentrate on the (comparatively simple) case of *metric transition systems* [3,20] in the presentation. Second, we generalize from real-valued to *quantale-valued* metrics (e.g. [24,33]). Using the unit interval quantale, we recover our previous results on real-valued logics as special cases. Beyond this, quantales in particular provide support for what may be termed *metrics with effects*; we illustrate this on a notion of *convex-nondeterministic behavioural distance* on metric transition systems, where the behavioural distance gives an interval of possible real-valued distances. Lastly, we remove the assumption that distances need to be symmetric, so that we cover also notions of quantitative simulation. At this level of generality, we prove both a Hennessy-Milner type theorem stating coincidence of logical and behavioural distance, effectively generalizing the existing coalgebraic quantitative Hennessy-Milner theorem [37] to quantale-valued distances; and, as our main result, a quantitative van Benthem theorem stating that all behaviourally non-expansive first-order properties can be modally approximated in bounded rank.

*Related Work* There is a substantial body of work on two-valued modal characterization theorems, e.g. for logics with frame conditions [14], coalgebraic modal logics [52], fragments of XPath [12,22,1], neighbourhood logic [28], modal logic with team semantics [38], modal μ-calculi (within monadic second order logics) [35,19], PDL (within weak chain logic) [11], modal first-order logics [6,54], and two-dimensional modal logics with an S5-modality [55]. We are not aware of quantitative modal characterization theorems other than the mentioned ones for fuzzy and probabilistic modal logics [57,58]. Prior to the quantitative Hennessy-Milner theorems mentioned above [10,37], Hennessy-Milner theorems have been established for *two-valued* logics and two-valued bisimilarity over quantitative systems, e.g. on probabilistic transition systems [39,16,17]. There is work on Hennessy-Milner theorems for certain Heyting-valued modal logics [21,18]; since Heyting algebras are quantales but often fail to meet a continuity assumption needed in our generic Hennessy-Milner theorem, we do not claim to subsume these results.

#### **2 Preliminaries**

We briefly recall basic definitions and examples on quantales and universal coalgebra, and fix some data needed throughout the paper. We need some elementary category theory, see, e.g., [2].

**Quantales** are order-algebraic structures that serve as objects of truth values in suitable multi-valued logics, and also support a useful notion of generalized (pseudo-)metric space (e.g. [24,33,32]). Our arguments will rely on a certain amount of epsilontics, and hence require more specifically the use of *value quantales* [24].

We recall some basic order and lattice theory. A *complete lattice* is a partially ordered set $(V,\le)$ having all suprema $\bigvee A$ for $A\subseteq V$, equivalently all infima $\bigwedge A$. We denote binary meets and joins by $\wedge$ and $\vee$, respectively. Given $x,y\in V$, we say that $x$ is *well above* $y$, and write $x\gg y$, if whenever $y\ge\bigwedge A$ for some $A\subseteq V$, then $x\ge a$ for some $a\in A$. A complete lattice $(V,\le)$ is *completely distributive* if all joins in $V$ distribute over all meets, equivalently all meets distribute over all joins [46]. Another equivalent characterization is that $(V,\le)$ is completely distributive iff

$$y = \bigwedge \{ x \in V \mid x \gg y \} $$

for every y ∈ V [47].

In the definition of value quantale, we follow Flagg [24] in dualizing the usual continuity condition for quantales in order to avoid having to reverse the order when moving between the general development and basic examples such as the unit interval; deviating from his terminology, we emphasize this by the prefix 'co-':

**Definition 2.1 ((Value) co-quantales).** A *(commutative) co-quantale* V is a complete lattice (V, ≤) equipped with a commutative monoid structure (0, ⊕) that is *meet-continuous*:

$$a \oplus \bigwedge\_{i \in I} b\_i = \bigwedge\_{i \in I} (a \oplus b\_i).$$

A co-quantale $V$ is a *value co-quantale* [24] if $0$ is the bottom element of $V$ and moreover $(V,\le)$ is a *value distributive lattice*, i.e. a completely distributive complete lattice such that $|V| > 1$ and for all $x,y\in V$, $x,y\gg 0$ implies $x\wedge y\gg 0$. Correspondingly, we denote the greatest element of $V$ by $1$.

(Dually, in a *quantale* the operation $\oplus$ is required to be *join-continuous*.) By meet-continuity, we obtain a further binary operator $\ominus$ on a co-quantale $V$ by adjunction, defined by

$$a \ominus b \le v \quad \text{iff} \quad a \le b \oplus v$$

(equivalently, $a\ominus b = \bigwedge\{v\mid a\le b\oplus v\}$). The operator $\ominus$ is sometimes called the *internal hom* of $V$ [7]. Moreover, in a value co-quantale, we have that for each $\varepsilon\gg 0$, there exists $\delta\gg 0$ such that $2\cdot\delta := \delta\oplus\delta\le\varepsilon$ [24, Theorem 2.9]. This allows for proofs where an error bound $\varepsilon\gg 0$ needs to be split up into multiple smaller parts.

A simple example of a value co-quantale is the unit interval $[0,1]$ with the usual ordering, with truncated addition $a\oplus b = \min(a+b, 1)$ as the monoid structure. Correspondingly, the $\ominus$ operation is truncated subtraction $a\ominus b = \max(a-b, 0)$. We have $a\gg b$ iff $a > b$. We will give further examples in Section 3.
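A minimal computational sketch (ours) of this co-quantale and of the adjunction defining $\ominus$:

```python
import numpy as np

def oplus(a, b):          # truncated addition: the monoid structure on [0, 1]
    return min(a + b, 1.0)

def ominus(a, b):         # truncated subtraction, the adjoint "internal hom"
    return max(a - b, 0.0)

# Brute-force check of the adjunction:  a (-) b <= v  iff  a <= b (+) v
grid = np.linspace(0, 1, 21)
for a in grid:
    for b in grid:
        for v in grid:
            assert (ominus(a, b) <= v + 1e-12) == (a <= oplus(b, v) + 1e-12)
```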

**Universal Coalgebra** serves as a unified framework for many types of state-based systems [49], such as nondeterministic, probabilistic, alternating, game-based, or weighted systems. It is based on encapsulating the system type as a *functor* T, for our purposes on the category Set of sets and functions; such a T assigns to each set X a set T X, thought of as a type of structured collections over X, and to each map f : X → Y a map T f : T X → T Y , respecting identities and composition. A T*-coalgebra* (A, α) consists of a set A of *states* and a *transition map* α: A → T A, thought of as assigning to each state a structured collection of successors. Taking T to be the *covariant powerset functor* P, which assigns to each set X its powerset PX, we obtain relational transition systems as T-coalgebras. As a further example, the *(discrete) subdistribution functor* S assigns to each set X the set SX of discrete probability subdistributions μ on X (i.e. μ(X0) = μ(X) ≤ 1 for some countable subset X0 ⊆ X), and to each map f : X → Y the image measure function (i.e. $Sf(\mu)(B) = \mu(f^{-1}[B])$ for $B\subseteq Y$). S-coalgebras are probabilistic transition systems (or Markov chains) with possible deadlock: They assign to each state a subdistribution over possible successor states, with the gap of the total probability to 1 interpreted as the probability of deadlock. Additional instances are seen in Example 4.4. For the remainder of the paper, we *fix a set functor* T *and require that* T∅ *is nonempty* (hence our use of subdistributions instead of distributions in the examples). Moreover, we require w.l.o.g. that T is *standard*, i.e. preserves subset inclusions [5].

### **3 Quantale-Valued Distances and Lax Extensions**

A V*-valued relation* between sets A and B is a map R: A × B → V , which we also denote by R: A →+ B. For fixed A and B, we order the V-valued relations between A and B pointwise: R<sup>1</sup> ≤ R<sup>2</sup> ⇐⇒ ∀a ∈ A, b ∈ B. R1(a, b) ≤ R2(a, b). We compose relations R: A→+ B and S : B→+ C using the monoid operation on V:

$$(R;S)(a,c) = \bigwedge \{ R(a,b) \oplus S(b,c) \mid b \in B \}.$$

Given a function f : A → B and ε ∈ V , the ε*-graph* Grε,f is the relation

$$\text{Gr}\_{\varepsilon,f}(a,b) = \begin{cases} \varepsilon, & \text{if } f(a) = b; \\ 1, & \text{otherwise.} \end{cases}$$

We also write $\mathrm{Gr}_f = \mathrm{Gr}_{0,f}$ and, in case of the identity function, $\Delta_{\varepsilon,X} = \mathrm{Gr}_{\varepsilon,\mathrm{id}_X}$ and $\Delta_X = \Delta_{0,X}$.
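A small sketch (ours) of $V$-valued relation composition and of $\varepsilon$-graphs, instantiated in the unit-interval co-quantale:

```python
def compose(R, S, B, oplus=lambda a, b: min(a + b, 1.0)):
    """(R;S)(a,c) = inf_b ( R(a,b) (+) S(b,c) ), over a finite middle set B."""
    return lambda a, c: min(oplus(R(a, b), S(b, c)) for b in B)

def graph(f, eps=0.0):
    """epsilon-graph Gr_{eps,f}: distance eps to the image point, 1 elsewhere."""
    return lambda a, b: eps if f(a) == b else 1.0

# Two [0,1]-valued relations on the set {0, 1, 2}
B = [0, 1, 2]
R = lambda a, b: abs(a - b) / 2
S = lambda b, c: 0.0 if b == c else 0.4
RS = compose(R, S, B)
print(RS(0, 2))                        # 0.4, realised by passing through b = 0
print(graph(lambda a: a + 1)(0, 1))    # 0.0: (0, 1) lies on the graph of the successor map
```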

**Definition 3.1 (**V**-continuity space).** Let X be a set and let d: X →+ X. The pair (X, d) is a V*-continuity space* [24] if d ≤ Δ<sup>X</sup> and d ≤ d; d, or equivalently, if for all x, y, z ∈ X,

$$d(x,x) = \mathbf{0} \qquad \text{and} \qquad d(x,z) \le d(x,y) \oplus d(y,z).$$

The *dual* of $(X,d)$ is the $V$-continuity space $(X,d^*)$ where $d^*(x,y) = d(y,x)$. The *symmetrization* of $(X,d)$ is the space $(X,d^s)$ with $d^s(x,y) = d(x,y)\vee d^*(x,y)$. We say that $(X,d)$ is *symmetric* if $d = d^*$.

**Remark 3.2.** Recall that omission of the metric symmetry axiom d(x, y) = d(y, x) is standardly designated by the prefix 'quasi-' and omission of the antisymmetry axiom d(x, y) = 0 ⇒ x = y by the prefix 'pseudo-'; thus, continuity spaces could be termed *generalized pseudo-quasimetric spaces*, and symmetric continuity spaces *generalized pseudometric spaces*.

The co-quantale V itself is made into a V-continuity space (V, d<sup>V</sup> ) using the operator 9:

$$d\_{\mathcal{V}}(a,b) = a \ominus b.$$

For any set A, the *supremum distance* between V-valued maps f,g : A → V is

$$d^\vee\_\mathcal{V}(f,g) = \bigvee\_{a \in A} d\_\mathcal{V}(f(a), g(a)).$$

The usual notion of nonexpansive map generalizes as expected:

**Definition 3.3 (Nonexpansive maps).** A map f : X → Y between Vcontinuity spaces (X, d1) and (Y, d2) is *nonexpansive* if d2(f(x), f(y)) ≤ d1(x, y) for all x, y ∈ X. We denote the space of nonexpansive maps between (X, d1) and (Y, d2) by (X, d1) →<sup>1</sup> (Y, d2). In the special case of nonexpansive V-valued maps we write Pred(X, d)=(X, d) →<sup>1</sup> (V, d<sup>V</sup> ).

Ultimately we are interested in defining and reasoning about *behavioural distances*. Generally speaking, a behavioural distance is a V-continuity space defined on the carrier of a T-coalgebra α: A → T A in such a way that the behaviour defined by the coalgebra map α is incorporated into the distance values of states in A. This is accomplished using *relation liftings*, which lift V-valued relations giving distances between states to those giving distances between successor structures of states. We specifically generalize the notion of nonexpansive lax extension [56] to the quantale-valued case:

**Definition 3.4 (Lax Extension).** A *nonexpansive lax extension* of T is a mapping L that maps V-valued relations R: A×B → V to relations LR: T A×T B → V and satisfies the following axioms:

> (L1) $R_1\le R_2 \implies LR_1\le LR_2$
> (L2) $L(R;S)\le LR;LS$
> (L3) $L\mathrm{Gr}_f\le \mathrm{Gr}_{Tf}$
> (L4) $L\Delta_{\varepsilon,A}\le \Delta_{\varepsilon,TA}$

for all R, R1, R<sup>2</sup> : A →+ B,S : B →+ C, f : A → B and ε ∈ V .

(The notion of *lax extension*, given by axioms (L1)–(L3), is standard, e.g. [31]; the axiom (L4), introduced in [56], guarantees nonexpansiveness w.r.t. the supremum metric as shown in Lemma 3.6.)

**Lemma 3.5.** *If* L *is a lax extension of* T *and* (A, d) *is a* V*-continuity space, then so is* (T A, Ld)*.*

**Lemma 3.6.** *If* L *is a nonexpansive lax extension of* T*, then* L *is in fact nonexpansive w.r.t. the supremum metric. That is, for* R1, R2 : A →+ B *we have* $d^\vee_V(LR_1, LR_2) \le d^\vee_V(R_1, R_2)$*.*

*Proof.* We have $d^\vee_V(R_1, R_2)\le\varepsilon \iff R_1\le R_2;\Delta_\varepsilon$. Using (L1), (L2) and (L4), we have $LR_1\le L(R_2;\Delta_\varepsilon)\le LR_2; L\Delta_\varepsilon\le LR_2;\Delta_\varepsilon$, so $d^\vee_V(LR_1, LR_2)\le\varepsilon$.

For technical purposes, we will be interested in a generalized version of total boundedness (recall that a standard metric space is compact iff it is complete and totally bounded):

**Definition 3.7 (Total boundedness).** Let $(X,d)$ be a $V$-continuity space. For $\varepsilon\gg 0$, we write $B^s_\varepsilon(x) = \{y\in X\mid d^s(x,y)\le\varepsilon\}$ for the *(symmetric) ball* of radius $\varepsilon$ around $x\in X$. A *finite* $\varepsilon$*-cover* of $(X,d)$ is a choice of finitely many $x_1,\ldots,x_n\in X$ such that $X = \bigcup_{i=1}^n B^s_\varepsilon(x_i)$. We say that $(X,d)$ is *totally bounded* if $X$ has a finite $\varepsilon$-cover for each $\varepsilon\gg 0$.

**Remark 3.8.** Note that use of the symmetrization $d^s$ is essential in the above definition; e.g. in the unit interval, with $d(x,y) = x\ominus y$, the set $\{y\mid d(0,y)\le\varepsilon\}$ is the whole space, so $0$ alone would form an $\varepsilon$-cover of $[0,1]$ if we replaced $d^s$ with $d$.

Moreover, our main result involves a generalization of the standard notion of density:

**Definition 3.9 (Density).** Let $(X,d)$ be a $V$-continuity space. A subset $Y\subseteq X$ is *dense* if for every $x\in X$ and $\varepsilon\gg 0$ there exists $y\in Y$ such that $d^s(x,y)\le\varepsilon$.

**Assumption 3.10.** Throughout the paper, we *fix a value co-quantale* V *that is totally bounded as a* V*-continuity space*. Moreover, we fix a dense subset V<sup>0</sup> ⊆ V for use as a set of truth constants in the relevant logics, with a view to keeping the syntax countable in the central examples. (The technical development, on the other hand, does not require V<sup>0</sup> to be countable, so we can always take V<sup>0</sup> = V .)

#### **Example 3.11 ((Value) co-quantales).**

1. The set $2 = \{0,1\}$, with $0\le 1$ and with binary join as the monoid structure, is a value co-quantale [24], and of course totally bounded. 2-Continuities $d$ are just preorders, with $y$ being above $x$ if $d(x,y) = 0$ (!); symmetric 2-continuities are equivalence relations. Notice that $0\gg 0$ in $2$. The $\ominus$ operator is given by $a\ominus b = 1$ iff $a = 1$ and $b = 0$.

2. The dual of every *locale* (e.g. [8]), in particular the set of closed subsets of any topological space, forms a co-quantale, with binary join as the monoid structure. However, locales are not in general value co-quantales. The dual Ω(R) of the *free* locale over a set R, described as the lattice of down-closed systems of finite subsets of R (ordered by reverse inclusion of such set systems), does form a value co-quantale [24], and is totally bounded [30]. Ω(R)-continuity spaces are known as *structure spaces* [30,24].

3. The unit interval [0, 1] is totally bounded. [0, 1]-Continuity spaces coincide with 1-bounded pseudo-quasimetric spaces, and symmetric [0, 1]-continuity spaces with 1-bounded pseudometric spaces in the standard sense (cf. Remark 3.2).

4. *Convex-nondeterministic distances:* The set I of nonempty closed subintervals (i.e. finitely generated nonempty convex subsets) of $[0,1]$, written in the form $[a,b]$ with $a\le b$, ordered by $[a,b]\le[c,d]$ iff $a\le c$ and $b\le d$, and equipped with truncated Minkowski addition $[a,b]\oplus[c,d] = [a\oplus c, b\oplus d]$ (with $\oplus$ on $[0,1]$ defined as in the previous item), is a totally bounded value co-quantale. We write $[a,b] = [a, \max(a,b)]$. We have $[a,b]\gg 0 = [0,0]$ iff $a > 0$, and $[a,b]\ominus[c,d] = [a\ominus c, b\ominus d]$, again with $\ominus$ on $[0,1]$ described as in the previous item. We can think of an I-continuity space as assigning to each pair of points a nondeterministic distance, given as an interval of possible distances.
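A sketch (ours) of this interval co-quantale: componentwise order, truncated Minkowski addition, and its componentwise adjoint:

```python
def ioplus(x, y):
    """Truncated Minkowski addition of closed subintervals of [0,1]."""
    return (min(x[0] + y[0], 1.0), min(x[1] + y[1], 1.0))

def iominus(x, y):
    """Componentwise truncated subtraction, the adjoint of ioplus."""
    return (max(x[0] - y[0], 0.0), max(x[1] - y[1], 0.0))

def leq(x, y):
    """Componentwise order on intervals: [a,b] <= [c,d] iff a <= c and b <= d."""
    return x[0] <= y[0] and x[1] <= y[1]

assert ioplus((0.25, 0.5), (0.25, 0.75)) == (0.5, 1.0)
assert iominus((0.5, 0.75), (0.25, 0.25)) == (0.25, 0.5)
assert leq((0.0, 0.0), (0.25, 0.5))      # [0,0] is the bottom element 0
```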

### **4 Quantale-Valued Modal and Predicate Logics**

We next introduce the main objects of study, quantale-valued coalgebraic modal and predicate logics. They will feature modalities interpreted using a quantitative version of *predicate liftings* [45,50,51]. Predicate liftings take their name from the fact that they lift predicates on a base set X to predicates on the set T X (where T is our globally fixed functor representing the system type according to Section 2). We work with V*-valued predicates*, which are organized in the *contravariant* V*-powerset* functor Q given on sets X by QX = X → V and on functions <sup>f</sup> : <sup>X</sup> <sup>→</sup> <sup>Y</sup> by <sup>Q</sup>f(g) = <sup>g</sup> ◦ <sup>f</sup> (that is, <sup>Q</sup> is a functor Setop <sup>→</sup> Set where Setop is the opposite category of Set). In keeping with the prevalent reading in fuzzy and probabilistic logics (where, typically, V = [0, 1]), we read 0 ∈ V as 'false' and 1 ∈ V as 'true' (opposite choices are also found in the literature, e.g. in modal logics for metric transition systems [3], where 0 ∈ [0, 1] is interpreted as 'true'). Predicate liftings can have arbitrary finite arities [50]. For brevity, we restrict the presentation to unary modalities and predicate liftings; generalizing to higher arities requires only more indexing.

**Definition 4.1.** A *(*V*-valued) predicate lifting* is a natural transformation λ: Q→Q◦ T, i.e. a family of maps λ<sup>X</sup> : QX → QT X, indexed over all sets X, such that λ<sup>Y</sup> (f)(T h(t)) = λX(f ◦ h)(t) for all f : Y → V , h: X → Y , t ∈ T X.

**Definition 4.2.** Let $\lambda$ be a predicate lifting. We say that $\lambda$ is *monotone* if $f\le g$ implies $\lambda_X(f)\le\lambda_X(g)$ for all $f,g\colon X\to V$, and that $\lambda$ is *nonexpansive* if $d^\vee_V(\lambda_X(f),\lambda_X(g))\le d^\vee_V(f,g)$ for all $f,g\colon X\to V$.
For the remainder of the paper, we *fix a set* Λ *of monotone and nonexpansive predicate liftings*, which, by abuse of notation, we also use as modalities in the syntax. A basic example is the ♦ modality of quantitative probabilistic modal logic [10], which denotes expected probability (in the next transition step) and corresponds to a predicate lifting for the (sub-)distribution functor S (Section 2); see Example 4.4.2 for details. The generic **syntax** of *(*V*-valued) quantitative coalgebraic modal logic* is then given by the grammar

$$
\varphi, \psi ::= c \mid \varphi \oplus c \mid \varphi \ominus c \mid \varphi \land \psi \mid \varphi \lor \psi \mid \lambda \varphi \qquad (c \in V_0,\ \lambda \in \Lambda).
$$

The operators $\oplus$, $\ominus$, $\vee$, $\wedge$ denote co-quantale operations; the meaning of $\lambda$ is determined by the associated predicate lifting. As usual, the *rank* of a formula $\varphi$ is the maximal nesting depth of modalities $\lambda$ in $\varphi$. We denote the set of all modal formulae by $\mathcal{L}^\Lambda$ and the set of formulae of rank at most $n$ by $\mathcal{L}^\Lambda_n$.

Formally, the **semantics** is defined by assigning to each formula $\varphi$ and each $T$-coalgebra $\alpha: A\to TA$ the *extension* $\llbracket\varphi\rrbracket_\alpha : A\to V$, or just $\llbracket\varphi\rrbracket$, of $\varphi$ over $\alpha$, recursively defined by

$$\begin{aligned} \llbracket\varphi\oplus c\rrbracket(a) &= \llbracket\varphi\rrbracket(a)\oplus c & \llbracket\varphi\ominus c\rrbracket(a) &= \llbracket\varphi\rrbracket(a)\ominus c\\ \llbracket\varphi\wedge\psi\rrbracket(a) &= \llbracket\varphi\rrbracket(a)\wedge\llbracket\psi\rrbracket(a) & \llbracket\varphi\vee\psi\rrbracket(a) &= \llbracket\varphi\rrbracket(a)\vee\llbracket\psi\rrbracket(a)\\ \llbracket c\rrbracket(a) &= c & \llbracket\lambda\varphi\rrbracket(a) &= \lambda_A(\llbracket\varphi\rrbracket)(\alpha(a)) \end{aligned}$$

**Remark 4.3.** Fuzzy logics differ widely in their interpretation of propositional connectives (e.g. [41]). In our modal syntax, we necessarily restrict to nonexpansive operations, in order to ensure nonexpansiveness w.r.t. behavioural distance later; this is typical of characteristic logics for behavioural distances (such as quantitative probabilistic modal logic [10]). The logic hence does not include binary $\oplus$ or $\ominus$ (in the above syntax, we insist that one of the arguments is a constant). In terminology usually applied to $V = [0,1]$, we thus allow *Zadeh* connectives (such as $\vee$, $\wedge$) but not *Łukasiewicz* connectives, so for $V = [0,1]$, the above version of quantitative coalgebraic modal logic is essentially the Zadeh fragment of Łukasiewicz fuzzy coalgebraic modal logic [51].

The syntax does not include negation $1\ominus(-)$; if $V$ satisfies the De Morgan laws (e.g. these hold in $[0,1]$), $\Lambda$ is closed under *duals* $1\ominus(\lambda(1\ominus(-)))$, and $V_0$ is closed under negation (i.e. $c\in V_0$ implies $1\ominus c\in V_0$), then negation can be defined via negation normal forms as usual.

As the ambient predicate logic of the above modal logic, we use *(*V*-valued) quantitative coalgebraic predicate logic*, a quantitative variant of two-valued coalgebraic predicate logic [40]. Its **syntax** is given by

$$\varphi, \psi ::= c \mid x = y \mid \varphi\oplus c \mid \varphi\ominus c \mid \varphi\wedge\psi \mid \varphi\vee\psi \mid \exists x.\,\varphi \mid \forall x.\,\varphi \mid x\,\lambda\,\lceil y\colon\varphi\rceil$$

where $c\in V_0$, $\lambda\in\Lambda$, and $x, y$ come from a fixed supply $\mathsf{Var}$ of (individual) variables. The reading of $x\,\lambda\,\lceil y\colon\varphi\rceil$ is the modalized truth degree (according to $\lambda$) to which the successors $y$ of a state $x$ satisfy $\varphi$; e.g. with $\Diamond$ as above, $x\,\Diamond\,\lceil y\colon\varphi\rceil$ is the expected truth value of $\varphi$ at a random successor $y$ of $x$. The **semantics** over $(A,\alpha)$ as above is given by $V$-valued maps $\llbracket\varphi\rrbracket_\alpha$, or just $\llbracket\varphi\rrbracket$, that are defined on valuations $\kappa:\mathsf{Var}\to A$. The interesting clauses in the definition are

$$\begin{aligned} \llbracket\exists x.\,\varphi\rrbracket(\kappa) &= \bigvee_{a\in A}\llbracket\varphi\rrbracket(\kappa[x\mapsto a]) & \llbracket\forall x.\,\varphi\rrbracket(\kappa) &= \bigwedge_{a\in A}\llbracket\varphi\rrbracket(\kappa[x\mapsto a])\\ \llbracket x\,\lambda\,\lceil y\colon\varphi\rceil\rrbracket(\kappa) &= \lambda_A(\llbracket\varphi\rrbracket(\kappa[y\mapsto\cdot]))(\alpha(\kappa(x))) \end{aligned}$$

(where $\kappa[y\mapsto a]$ maps $y$ to $a$ and otherwise behaves like $\kappa$, and by $\llbracket\varphi\rrbracket(\kappa[y\mapsto\cdot])$ we mean the predicate that maps $a$ to $\llbracket\varphi\rrbracket(\kappa[y\mapsto a])$). Moreover, equality is crisp, i.e. $\llbracket x = y\rrbracket(\kappa)$ is $1$ if $\kappa(x) = \kappa(y)$, and $0$ otherwise.

**Example 4.4.** We discuss some instances of the above framework.

1. *Fuzzy modal logic:* Take $T$ to be the *covariant* $V$-valued powerset functor, i.e. $TX = X\to V$ and $Tf(A)(y) = \bigvee\{A(x)\mid f(x) = y\}$ for $f : X\to Y$. We think of $A\in TX$ as a $V$-valued fuzzy subset of $X$; we say that $A$ is *crisp* if $A(x)\in\{0,1\}$ for all $x$. Put $\Lambda = \{\Diamond\}$ where $\Diamond_X(A)(B) = \bigvee\{A(x)\wedge B(x)\mid x\in X\}$ for $A\in QX$, $B\in TX$. Then $T$-coalgebras are equivalent to fuzzy Kripke frames, which consist of a set $X$ and a fuzzy relation $R: X\times X\to V$, and $\Diamond$ is the natural fuzzification of the standard diamond modality (a small code sketch of this and of the next instance follows this example). Fuzzy propositional atoms from a set $\mathsf{At}$ can be added by passing to the functor that maps a set $X$ to $Q(\mathsf{At})\times TX$. Instantiating to $V = [0,1]$, we obtain a basic modal logic of fuzzy relations, or in description logic terminology *Zadeh fuzzy* ALC [53]. The corresponding instance of quantitative coalgebraic predicate logic is essentially the Zadeh fragment of Novak's Łukasiewicz fuzzy first order logic [43].

2. *Probabilistic modal logic:* As indicated in Section 2, coalgebras for the subdistribution functor S are probabilistic transition systems (with possible deadlock). We take <sup>V</sup> = [0, 1] and <sup>Λ</sup> <sup>=</sup> {♦}, interpreted by the predicate lifting

$$\Diamond\_X(A)(\mu) = \mathbb{E}\_{\mu}(A) \qquad \text{for } \mu \in \mathcal{S}X$$

where $\mathbb{E}_\mu(A)$ denotes the expected value of $A(x)$ when $x$ is distributed according to $\mu$. The induced instance of quantitative coalgebraic modal logic is *(quantitative) probabilistic modal logic* [10], which may be seen as a quantitative variant of two-valued probabilistic modal logic [39], and embeds into the probabilistic μ-calculus [34,42]. Propositional atoms are treated analogously as in the previous item (and indeed probabilistic modal logic is trivial without them). The ambient quantitative probabilistic first-order logic arising as the corresponding instance of quantitative coalgebraic predicate logic is a quantitative variant of Halpern's type-1 (i.e. statistical) probabilistic first-order logic [27].

3. *Metric modal logic:* In their simplest form, *metric transition systems* [3] are just transition systems in which states are labelled in a metric space S (numerous variants exist, e.g. with states themselves forming a metric space or with transitions labelled in a metric space [9]). We work with a generalized version where (S, dS) is a V-continuity space. Metric transition systems are then coalgebras for the functor T X given on sets by T X = S × PX. We take Λ = {♦} ∪ <sup>S</sup>. We interpret <sup>Λ</sup> using predicate liftings

$$\Diamond\_X(A)(s,B) = \bigvee\{A(x) \mid x \in B\} \qquad r\_X(A)(s,B) = d\_S(s,r)$$

for $A\in QX$, $(s,B)\in TX$, $r\in S$. Note that $r\in S$ ignores its argument $A$, so is effectively a nullary modality. Note also that as per our interpretation of truth values, this nullary modality is read as distinctness from $r$; in case $V = [0,1]$, the degree of equality to $r$ can be expressed as $1\ominus r$. The induced instance of coalgebraic modal logic is related to characteristic logics for branching-time behavioural distances on metric transition systems [3,9].

4. *Convex-nondeterministic metric modal logic:* We continue to consider metric transition systems as recalled in the previous item, reusing the designators $T$, $S$, $d_S$, and taking $V = [0,1]$ for simplicity. Recall the value co-quantale I of nonempty closed subintervals of $[0,1]$ from Example 3.11.4. We turn the predicate liftings for $r\in S$ defined in the previous item into I-valued predicate liftings by prolonging them along the inclusion $\iota: [0,1]\hookrightarrow$ I, given by $\iota(a) = [a,a]$. We define an I-valued predicate lifting $M$ for $T$, where I is the value quantale of closed intervals introduced in Example 3.11.4, by

$$M\_X(A)(s,B) = \left[ \bigwedge \{ \pi\_1(A(x)) \mid x \in B \}, \bigvee \{ \pi\_2(A(x)) \mid x \in B \} \right]$$

where $\pi_i :$ I $\to [0,1]$ denote the evident *projections* $\pi_1([a,b]) = a$, $\pi_2([a,b]) = b$. That is, $M$ returns the range of truth values that $A$ takes on $B$.
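The fuzzy and probabilistic diamonds of Examples 4.4.1 and 4.4.2 are easy to run on finite data; the following sketch (ours, with $V = [0,1]$) evaluates both:

```python
# Example 4.4.1: fuzzy diamond over a finite fuzzy Kripke frame
X = ["s", "t", "u"]
R = {("s", "t"): 0.8, ("s", "u"): 0.3, ("t", "u"): 1.0}       # fuzzy accessibility relation
acc = lambda x, y: R.get((x, y), 0.0)

def diamond_fuzzy(phi):
    """<>phi (x) = sup_y ( R(x,y) /\\ phi(y) )."""
    return lambda x: max(min(acc(x, y), phi(y)) for y in X)

p = {"s": 0.1, "t": 0.9, "u": 0.5}.get                         # a fuzzy predicate
print(diamond_fuzzy(p)("s"))                  # max(0.8 /\ 0.9, 0.3 /\ 0.5) = 0.8
print(diamond_fuzzy(diamond_fuzzy(p))("s"))   # a rank-2 formula

# Example 4.4.2: probabilistic diamond = expectation under a subdistribution
def diamond_prob(phi, mu):
    """<>phi at a state with successor subdistribution mu: E_mu(phi)."""
    return sum(prob * phi(x) for x, prob in mu.items())

mu = {"t": 0.5, "u": 0.3}                     # total mass 0.8, deadlock probability 0.2
print(diamond_prob(p, mu))                    # 0.5 * 0.9 + 0.3 * 0.5 = 0.6
```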

### **5 Behavioural Distance and Quantitative Bisimulation Invariance**

The behavioural distance between states of a coalgebra α: A → T A is defined as a least fixpoint that arises from an iterative process: Initially, at depth 0, all states are thought of as equivalent and their distance is therefore 0. In order to increase the depth of the behavioural distance from n to n + 1, we lift the depth-n distance on A to the set T A of successor structures. Formally, this is accomplished using the following quantale-valued version of the coalgebraic Kantorovich lifting [4,56]:

**Definition 5.1 (Kantorovich lifting).** Let $A$ and $B$ be sets and R: A →+ B. Call a pair of maps $f: A\to V$, $g: B\to V$ *$R$-nonexpansive* if $f(a)\ominus g(b)\le R(a,b)$ for all $a\in A$, $b\in B$. The *Kantorovich lifting* $K_\Lambda(R)$: T A →+ T B is defined by

$$K_\Lambda(R)(t_1, t_2) = \bigvee\{\lambda_A(f)(t_1)\ominus\lambda_B(g)(t_2)\mid \lambda\in\Lambda,\ (f,g)\ R\text{-nonexpansive}\}.$$

(Here, Λ is the set of modalities fixed in Section 4.) Generalizing [56, Theorem 5.6], we have:

**Lemma 5.2.** *The Kantorovich lifting is a nonexpansive lax extension.*

#### **Example 5.3 (Kantorovich liftings).**

1. For <sup>V</sup> = [0, 1] and <sup>V</sup>-valued *fuzzy modal logic* with <sup>Λ</sup> <sup>=</sup> {♦} (i.e. for simplicity without propositional atoms; cf. Example 4.4.1), the Kantorovich lifting KΛ(R) of a V-valued relation R: X →+ Y coincides with an asymmetric generalized Hausdorff lifting; i.e.

$$K_\Lambda(R)(A,B) = \bigvee_{x\in X}\bigwedge_{y\in Y}\left((A(x)\ominus B(y))\vee(A(x)\wedge R(x,y))\right)$$

for $A\in TX = X\to V$, $B\in TY$. (Obtaining a similar description for general $V$ remains an open problem.) In particular, on crisp sets $A$, $B$, the symmetrization $K_\Lambda(R)^s$ is the usual Hausdorff lifting $K_\Lambda(R)^s(A,B) = \max\left(\bigvee_{A(x)=1}\bigwedge_{B(y)=1} R(x,y),\ \bigvee_{B(y)=1}\bigwedge_{A(x)=1} R(x,y)\right)$.

2. For *probabilistic modal logic* (Example 4.4.2), the restriction of K<sup>Λ</sup> to distributions coincides, by definition, with the usual (symmetric) Kantorovich-Wasserstein lifting (e.g. [10]). On subdistributions, one obtains an asymmetric variant, whose symmetrization then coincides with the standard one.

3. For <sup>V</sup>-valued *metric modal logic* (Example 4.4.3), with <sup>Λ</sup> <sup>=</sup> {♦} ∪ <sup>S</sup>, we similarly obtain a V-valued (asymmetric) Hausdorff distance

$$K_\Lambda(R)((s,A),(t,B)) = d(s,t)\vee\bigvee_{x\in A}\bigwedge_{y\in B} R(x,y)$$

on (s, A) ∈ T X = S × P(X), (t, B) ∈ T Y , and R: X →+ Y ; a characterization that in this case holds for unrestricted V.

4. *Convex-nondeterministic metric modal logic:* The I-valued Kantorovich lifting induced by the set Λ = {M}∪S of modalities on metric transition systems, with notation as in Examples 3.11.4 and 4.4.4, is given by

$$K_\Lambda(R)((s,A),(t,B)) = \iota(d(s,t))\vee\left[\bigvee_{y\in B}\bigwedge_{x\in A}\pi_1(R(x,y)),\ \bigvee_{x\in A}\bigwedge_{y\in B}\pi_2(R(x,y))\right]$$

on (s, A) ∈ T X = S × P(X), (t, B) ∈ T Y , and R: X →+ Y (recall that the π<sup>i</sup> are the projections I → [0, 1], and ι: [0, 1] → I denotes the evident injection).

For purposes of lifting V-continuity structures as relations, nonexpansive pairs can be replaced with the more familiar notion of nonexpansive map:

**Lemma 5.4.** *Let* $(A,d)$ *be a* $V$*-continuity space and let* $(f,g)$ *be* $d$*-nonexpansive. Put* $h(b) = \bigvee_{a\in A}\left(f(a)\ominus d(a,b)\right)$*. Then* $f\le h\le g$ *and* $h\in\mathsf{Pred}(A,d)$*.*

By monotonicity of predicate liftings we get the following alternative formulation for the Kantorovich lifting of a V-continuity structure:

**Lemma 5.5.** *Let* $(A,d)$ *be a* $V$*-continuity space. Then for all* $t_1, t_2\in TA$

$$K_\Lambda(d)(t_1,t_2) = \bigvee\{\lambda_A(h)(t_1)\ominus\lambda_A(h)(t_2)\mid\lambda\in\Lambda,\ h\in\mathsf{Pred}(A,d)\}.$$

Using the Kantorovich lifting, we can now define a sequence of behavioural distances between states $a$, $b$ in $T$-coalgebras $\alpha: A\to TA$, $\beta: B\to TB$:

$$d_0^K(a,b) = 0 \qquad d_{n+1}^K(a,b) = K_\Lambda(d_n^K)(\alpha(a),\beta(b)) \qquad d_\omega^K(a,b) = \bigvee_{n<\omega} d_n^K(a,b).$$

By general fixed point theory, the continuation of this ordinal-indexed sequence past $\omega$ eventually stabilizes, that is, there exists some ordinal $\gamma$ such that $d^K_{\gamma+1} = d^K_\gamma$. The arising least fixed point is the unbounded *behavioural distance* $d^K$, alternatively given by

$$d^K = \bigwedge\{d\mid d = K_\Lambda(d)\circ(\alpha\times\beta)\}.$$
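For the metric-transition-system instance, the iteration can be run directly with the Hausdorff-style lifting of Example 5.3.3; the following brute-force sketch (ours, over the unit-interval co-quantale with truncated subtraction as label distance) computes the simulation distance of Example 5.6 on a three-state system:

```python
def ominus(a, b):                      # truncated subtraction on [0, 1]
    return max(a - b, 0.0)

# A small metric transition system: each state carries a label in [0,1] and a set of successors.
label = {"a": 0.0, "b": 0.1, "c": 1.0}
succ  = {"a": {"b"}, "b": {"b"}, "c": {"c"}}
states = list(label)

d = {(x, y): 0.0 for x in states for y in states}     # d_0 = 0 everywhere

for _ in range(20):                                   # iterate d_{n+1} = K(d_n) on successor structures
    d = {(x, y): max(ominus(label[x], label[y]),
                     max((min(d[(p, q)] for q in succ[y]) for p in succ[x]), default=0.0))
         for x in states for y in states}

print(d[("a", "b")])   # 0.0: a is simulated by b at no cost
print(d[("b", "a")])   # 0.1: simulating b by a costs the label difference
print(d[("c", "a")])   # 1.0
```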

These behavioural distances lead to an appropriate generalization of the notion of bisimulation invariance. A family f of V-valued predicates f<sup>α</sup> indexed over T-coalgebras α: A → T A – such as the extension of a modal formula or of a first-order formula with a single free variable – is said to be *behaviourally nonexpansive* if it is nonexpansive with respect to behavioural distance d<sup>K</sup>, i.e. if for all coalgebras α: A → T A, β : B → T B and all a ∈ A, b ∈ B,

$$f_{\alpha}(a) \ominus f_{\beta}(b) \le d^{K}(a, b). \tag{1}$$

Similarly, f is *depth-*n *behaviourally nonexpansive* for finite depth n if f is nonexpansive with respect to the depth-n behavioural distance d^K_n.

To match these notions to the classical setting, consider the binary co-quantale 2. In the general case, the above notion of behavioural nonexpansiveness should then be thought of as preservation under simulation: states a, b have (asymmetric) distance 0 if b simulates a, and in this case, (1) stipulates that if f is true at a, then f is also true at b.

**Example 5.6.** The behavioural distance arising from the Kantorovich lifting of metric modal logic (Example 5.3.3) is a *simulation distance*. The value d^K(a, b) quantifies the degree to which traces starting at b simulate traces starting at a, where the distance from one trace to another is the supremum over the distances at all time steps.

On the other hand, there are many cases where the behavioural distance d^K is symmetric. If V = [0, 1] and the set Λ is closed under duals (Remark 4.3), then we have that KΛ(R∗) = KΛ(R)∗ for all R and therefore d^K is symmetric [56]. Concretely, if we put □_X(A) = 1 ⊖ ♦_X(1 ⊖ A), then in the case of fuzzy modal logic (Example 4.4.1) we have □_X(A)(B) = ⋀{(1 ⊖ B(x)) ∨ A(x) | x ∈ X}, and in the case of probabilistic modal logic (Example 4.4.2) we have □_X(A)(μ) = E_μ(A) ⊕ (1 ⊖ μ(X)); in both cases Λ = {♦, □} yields a symmetric distance.
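
To see where the probabilistic formula comes from, one can unfold the dual explicitly; the calculation below assumes, as in Example 4.4.2, that the diamond is interpreted as expectation, ♦_X(A)(μ) = E_μ(A):

$$\Box_X(A)(\mu) = 1 \ominus \Diamond_X(1 \ominus A)(\mu) = 1 \ominus E_\mu(1 \ominus A) = 1 \ominus \big(\mu(X) \ominus E_\mu(A)\big) = E_\mu(A) \oplus (1 \ominus \mu(X)),$$

where no truncation occurs in the last two steps because A ≤ 1 and E_μ(A) ≤ μ(X) ≤ 1.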

In these symmetric cases distance 0 determines a notion of *bi*similarity, and behavioural nonexpansiveness amounts to the standard notion of bisimulation invariance. Thus, the following straightforward lemma generalizes both bisimulation invariance of modal logic and preservation of positive modal logic (with only diamond modalities) under simulation:

**Lemma 5.7.** *All modal formulae are behaviourally nonexpansive, and all modal formulae of rank at most* n *are depth-*n *behaviourally nonexpansive.*

As expected, coalgebra morphisms preserve behaviour on the nose:

**Lemma 5.8.** *Let* α: A → T A *and* β: B → T B *be coalgebras and* h: A → B *a coalgebra morphism, that is,* T h ∘ α = β ∘ h*. Then* d^{K,s}(a, h(a)) = 0 *for all* a ∈ A*.*

Another way to define distances between states of a coalgebra is in terms of the modal formulae:

**Definition 5.9 (Logical distance).** Let a, b be states in coalgebras α: A → T A, β : B → T B. We define

$$d_n^L(a,b) = \bigvee\{\llbracket\varphi\rrbracket(a) \ominus \llbracket\varphi\rrbracket(b) \mid \varphi \in \mathcal{L}_n^\Lambda\}$$

$$d^L(a,b) = \bigvee\{\llbracket\varphi\rrbracket(a) \ominus \llbracket\varphi\rrbracket(b) \mid \varphi \in \mathcal{L}^\Lambda\}$$

The relationship between fixpoint-based distances d^K and logical distances d^L is at the heart of the study of behavioural nonexpansiveness and modal expressiveness. For instance, Lemma 5.7 can equivalently be expressed by the inequalities d^L ≤ d^K and d^L_n ≤ d^K_n for n < ω. In Section 6, we investigate the converse inequalities.

#### **6 Modal Approximation**

We now establish our first contribution, a quantitative coalgebraic Hennessy-Milner theorem. To this end, we first need to pin down the exact relationship of the two families of distances at finite depth.

**Theorem 6.1.** *Let the set* Λ *of monotone and nonexpansive predicate liftings from Section 4 be finite and let* (A, α) *be a coalgebra. For all* n < ω*:*

1. *We have* d^K_n = d^L_n *on* A*;*
2. *the space* (A, d^K_n) *is totally bounded;*
3. *for every nonexpansive* f ∈ Pred(A, d^K_n) *and every* ε ≻ 0 *there exists a modal formula* ϕ ∈ L^Λ_n *such that* d^{∨,s}_V(f, ⟦ϕ⟧) ≤ ε*.*

**Remark 6.2.** The need for assuming that the set Λ of modalities is finite is specific to quantitative Hennessy-Milner theorems (and implicitly present also in the existing [0, 1]-valued version of the theorem [37]), and not needed in the two-valued case [45,50]. It relates to the total boundedness claim in Theorem 6.1, and features also in the van Benthem theorem, where in fact it is needed also in the two-valued case [52]; indeed, proofs of the original van Benthem theorem start by assuming, in that case w.l.o.g., that there are only finitely many propositional atoms and relational modalities. In our running examples, only the ones featuring metric transition systems are affected by this assumption; indeed, for our theorems to apply to such systems, the space of labels needs to be finite.

Theorem 6.1 is proven by induction on n, and most of Section 6 is devoted to the inductive step (the base case n = 0 is immediate from d^K_0 = d^L_0 = 0). We fix a coalgebra α: A → T A and an integer n ≥ 1 and assume as the inductive hypothesis that the three items of Theorem 6.1 have already been proven for all m < n. We show Item 1 in Lemma 6.3, Item 2 in Lemma 6.6, and Item 3 in Lemma 6.7.

**Lemma 6.3.** *We have* d^K_n = d^L_n *on* A*.*

*Proof (sketch).* We use the alternative formula for the Kantorovich lifting as given in Lemma 5.5. By Item 3 of the inductive hypothesis, and because the predicate liftings are nonexpansive, the maps λ(f) ◦ α with f ∈ Pred(A, d_{n−1}) can be approximated using formula expansions λψ with ψ ∈ L^Λ_{n−1}.

Having shown that d^K_n = d^L_n, from now on we simply use d_n to denote both. To show that d_n is totally bounded, we make use of the following version of the Arzelà-Ascoli theorem [23, Theorem 4.13].

**Lemma 6.4 (Arzelà-Ascoli).** *Let* (X, d_1) *and* (Y, d_2) *be totally bounded* V*-continuity spaces. Then the space* (X, d_1) →_1 (Y, d_2) *is also totally bounded.*

Using Lemma 6.4, we show that the Kantorovich lifting preserves total boundedness; this generalizes a previous result for the case V = [0, 1] [37, Proposition 29], which in turn generalizes [57, Lemma 5.6].

**Lemma 6.5.** *If the set* Λ *of predicate liftings is finite and* (X, d) *is a totally bounded* V*-continuity space, then* (TX, KΛ(d)) *is totally bounded.*

The following is now an easy consequence:

**Lemma 6.6.** *The space* (A, d_n) *is totally bounded.*

Finally, we show that the modal formulae up to depth n form a dense subspace of the space of all nonexpansive properties:

**Lemma 6.7.** *Let* f ∈ Pred(A, d_n) *be a nonexpansive map and let* ε ≻ 0*. Then there exists some modal formula* ϕ ∈ L^Λ_n *such that* d^{∨,s}_V(f, ⟦ϕ⟧) ≤ ε*.*

*Proof (sketch).* We use the fact that for all x, y ∈ A

$$f(x) = \bigwedge_{y \in A} d_n(x, y) \oplus f(y) = \bigwedge_{y \in A} \Big(\bigvee_{\gamma \in \mathcal{L}_n^\Lambda} \llbracket\gamma\rrbracket(x) \ominus \llbracket\gamma\rrbracket(y)\Big) \oplus f(y).$$

The first equality holds since f(x) ≤ d_n(x, y) ⊕ f(y) by nonexpansiveness of f, with equality at y = x, and the second unfolds d_n = d^L_n (Lemma 6.3). The latter term can be approximated using formulae of L^Λ_n, where the infimum over y and the supremum over γ are made finite using ε-covers of A and L^Λ_n.

Having shown that behavioural distance and logical distance coincide at all finite depths, we are now equipped to prove our first main result, a version of the Hennessy-Milner theorem stating that behavioural distance and logical distance coincide not only at finite depths (Theorem 6.1.1), but in fact also at unbounded depth. In general, this equivalence of distances can only be expected to hold if the functor T in question is *finitary*, or admits approximation by a finitary subfunctor [56]. The functor T is finitary if for all sets X and all t ∈ T X there exists a finite subset Y ⊆ X such that t = T i(s) for some s ∈ T Y, where i: Y → X is set inclusion. Examples of finitary functors include the *finite powerset functor* P_ω X = {Y ⊆ X | Y finite} and the *finite subdistribution functor* S_ω, which maps a set X to the set of finitely supported probability subdistributions on X. König and Mika-Michalski [37] prove a quantitative coalgebraic Hennessy-Milner theorem for the case of the co-quantale [0, 1]. We generalize their result as follows:

**Definition 6.8.** We say that the value co-quantale V is *continuous from below* if for every monotone increasing sequence (a_n)_{n<ω} in V and every ε ≻ 0, there exists some n such that a_n ⊕ ε ≥ ⋁_{n<ω} a_n.

This condition essentially allows the use of epsilontic arguments also for joins of increasing sequences, while value co-quantales in general allow this only for meets. It holds in all our running examples; for instance, in V = [0, 1], if a_n increases to a = ⋁_{n<ω} a_n, then for every ε > 0 some a_n already satisfies a_n ⊕ ε ≥ a.

**Theorem 6.9 (Quantified Hennessy-Milner theorem).** *Let* Λ *be a finite set of monotone and nonexpansive predicate liftings, let* T *be a finitary functor and let* V *be a totally bounded value co-quantale that is continuous from below. Then we have* d^K = d^L*.*

*Proof (sketch).* Because V is continuous from below, we have KΛ(d^K_ω) = ⋁_{n<ω} KΛ(d^K_n) on finite sets, and as T is finitary, this also holds for all sets. This implies that d^K_ω = d^K_{ω+1} = d^K, so that

$$d^K = \bigvee_{n < \omega} d^K_n = \bigvee_{n < \omega} d^L_n = d^L. \tag{7}$$

Besides examples already covered by the [0, 1]-valued version of the theorem [37], this result instantiates, e.g., to a quantitative Hennessy-Milner theorem for convex-nondeterministic metric modal logic (Example 4.4.4).

### **7 Locality and Modal Characterization**

We proceed to establish our main result, the quantitative coalgebraic van Benthem theorem. The main tool in the proof of this result is a notion of *locality*, which characterizes formulae that only depend on the structure of the model in some neighbourhood of the state under consideration. This poses a challenge when it comes to coalgebraic models, as these need not come with a built-in graph structure that could be used to define what it means for two states to be neighbouring. To solve this, we make use of a technique based on *supported* coalgebras that has previously been used in the proof of a two-valued coalgebraic van Benthem theorem [52].

Recall from Section 2 that we assume T∅ ≠ ∅. We fix an element ⊥ ∈ T∅, and for each set A put ⊥_A = T i(⊥), where i: ∅ → A is the empty map.

**Definition 7.1 (Support).** Let A be a set. We say that a set B ⊆ A is a *support* of t ∈ T A if t ∈ T B. A *supported coalgebra* is a coalgebra α: A → T A together with a map supp_α: A → PA such that supp_α(a) is a support of α(a) for every a ∈ A.

Every coalgebra can be supported because we can always put supp_α(a) = A for all a ∈ A. Supporting a coalgebra equips it with a graph structure:

**Definition 7.2 (Neighbourhood).** Let A = (A, α, supp_α) be a supported coalgebra. The *support graph distance* D_supp(a, b) is the least n such that there are states a = a_0, a_1, ..., a_n = b with a_{i+1} ∈ supp_α(a_i) for all i < n. For k < ω, the *radius-*k *neighbourhood* of a state a is U^k(a) = {b ∈ A | D_supp(a, b) ≤ k}.

For any k < ω and any state a in a supported coalgebra A = (A, α, supp_α), we can define a supported coalgebra A^k_a = (U^k(a), α^k, supp_{α^k}) on the radius-k neighbourhood of a. The coalgebra map α^k: U^k(a) → T(U^k(a)) is given by α^k(b) = α(b) if supp_α(b) ⊆ U^k(a) and α^k(b) = ⊥_A otherwise. We note that the latter case only occurs for states on the edge of U^k(a), that is, when D_supp(a, b) = k. Note that ⊥_A has empty support by construction, so that we can put supp_{α^k}(b) = ∅ in this latter case and supp_{α^k}(b) = supp_α(b) otherwise.
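
Concretely, the neighbourhood and the restricted coalgebra can be computed by breadth-first search over the support graph; the following Python sketch is illustrative only, represents a coalgebra naively as dictionaries, and uses hypothetical names throughout.

```python
from collections import deque

def neighbourhood(a, supp, k):
    """U^k(a): states within support distance at most k of a, mapped to D_supp(a, -)."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        b = queue.popleft()
        if dist[b] == k:
            continue                      # do not expand past radius k
        for c in supp[b]:
            if c not in dist:
                dist[c] = dist[b] + 1
                queue.append(c)
    return dist

def restrict_coalgebra(alpha, supp, a, k, bottom):
    """The supported coalgebra A^k_a: keep alpha(b) if supp(b) stays inside U^k(a), else use bottom."""
    inside = set(neighbourhood(a, supp, k))
    alpha_k = {b: alpha[b] if supp[b] <= inside else bottom for b in inside}
    supp_k = {b: supp[b] if supp[b] <= inside else set() for b in inside}
    return alpha_k, supp_k
```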

Using the neighbourhood around a state and the coalgebra structure defined on it, we can now define our notion of locality:

**Definition 7.3.** A formula ϕ is k*-local* if we have ϕ_α(a) = ϕ_{α^k}(a) for all supported coalgebras A = (A, α, supp_α) and all a ∈ A.

**Lemma 7.4.** *For every supported coalgebra* A = (A, α, supp_α)*,* k < ω *and* a ∈ A*, we have* d^{K,s}_k(a, a) = 0*, where the first* a *lives in* A *and the second in* A^k_a*.*

A key step in the proof is the following locality result, which in similar form appears also in proofs of the classical van Benthem theorem [44], and is proved, in our case, by a game-theoretic method that is related to classical Ehrenfeucht-Fraïssé games:

**Lemma 7.5.** *Let* ϕ(x) *be a behaviourally nonexpansive formula with* qr(ϕ) ≤ n*. Then* ϕ *is* k*-local for* k = 3^n*.*

*Proof (sketch).* Consider a spoiler-duplicator game over n rounds, where both players place a pebble every round and the second player needs to maintain the invariant that, if there are m rounds remaining, the radius-3^m neighbourhoods around the pebbles are isomorphic. One can show that this invariant guarantees equivalence on formulae of rank at most m.

We use this game to prove for every supported coalgebra A that ϕ has the same value on A and A^k_a. Nonexpansiveness of ϕ is used to extend the two coalgebras in such a way that the duplicator always has a suitable response.

We next show that every nonexpansive formula that is local is also nonexpansive at some finite depth. We make use of an unravelling construction, where a coalgebra is enlarged so that the successors of every state in the unravelling (as given by the support relation) form a tree.

**Definition 7.6 (Unravelling).** The *unravelling* of a supported coalgebra A = (A, α, supp_α) is the supported coalgebra A^∗ = (A^+, α^∗, supp_{α^∗}), where A^+ is the set of nonempty sequences over A and for a_1 ... a_n ∈ A^+ we have α^∗(a_1 ... a_n) = T f(α(a_n)) and supp_{α^∗}(a_1 ... a_n) = f[supp_α(a_n)], where f: A → A^+, a ↦ a_1 ... a_n a.
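
For the metric-transition-system functor T X = S × P(X), the action T f just renames successors, so a single entry of the unravelled coalgebra can be spelled out as follows (again a hypothetical sketch in the dictionary representation used above):

```python
def unravelling_step(alpha, supp, word):
    """alpha^*(a_1 ... a_n) = T f(alpha(a_n)) and supp(a_1 ... a_n) = f[supp(a_n)],
    for T X = S x P(X), where T f(s, A) = (s, f[A]) and f(a) = a_1 ... a_n a."""
    last = word[-1]                      # word is a nonempty tuple of states
    s, successors = alpha[last]          # alpha(a_n) = (label, set of successors)
    extend = lambda a: word + (a,)       # f: A -> A^+
    return (s, {extend(a) for a in successors}), {extend(a) for a in supp[last]}
```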

**Lemma 7.7.** *For every supported coalgebra* A = (A, α, supp_α) *and every* a ∈ A*, we have* d^{K,s}(a, a) = 0*, where the first* a *lives in* A *and the second in* A^∗*.*

The mentioned nonexpansiveness at finite depth follows:

**Lemma 7.8.** *Let* ϕ *be behaviourally nonexpansive and* k*-local. Then* ϕ *is also depth-*k *behaviourally nonexpansive.*

*Proof (sketch).* By the assumptions on ϕ we may pass from any supported coalgebra to the radius-k neighbourhood in the unravelling, which is shaped like a tree of depth k. Between any two such tree structures we have d^K_k = d^K, as their behaviour past depth k is fully characterized by the default value ⊥ ∈ T∅.

The target result then follows by combining the above lemmas with Theorem 6.1 and a final chain argument that allows us to detach the technical development from the choice of a fixed coalgebra:

**Theorem 7.9 (Quantified van Benthem theorem).** *Let* Λ *be a finite set of monotone and nonexpansive predicate liftings, let* T *be a standard functor with* T∅ ≠ ∅*, and let* V *be a totally bounded value co-quantale. Then for every behaviourally nonexpansive formula* ϕ *of quantitative coalgebraic predicate logic with quantifier rank at most* n *and every* ε ≻ 0 *there exists a modal formula* ψ ∈ L^Λ *such that for all coalgebras* α: A → T A *and all* a ∈ A*,* d^s_V(ϕ_α(a), ψ_α(a)) ≤ ε*, and the modal rank of* ψ *is bounded by* 3^n*.*

*Proof (sketch).* Using the final chain (T^n 1)_{n<ω}, where 1 is a singleton set, we can construct a coalgebra (Z, ζ) such that for all (A, α) and all ϕ, ψ we have d^{∨,s}_V(ϕ_α, ψ_α) ≤ d^{∨,s}_V(ϕ_ζ, ψ_ζ).

As ϕ is behaviourally nonexpansive, we get that it is also depth-k behaviourally nonexpansive for k = 3^{qr(ϕ)} by Lemmas 7.5 and 7.8, and by Theorem 6.1.3 for every ε ≻ 0 there is ψ ∈ L^Λ_k such that d^{∨,s}_V(ϕ_ζ, ψ_ζ) ≤ ε.

To the best of our knowledge, the only previously known instances of this result in the real-valued setting are the ones for [0, 1]-valued fuzzy modal logic [57] and for quantitative probabilistic modal logic [58]. In the two-valued setting, we cover a previous coalgebraic van Benthem result [52] by instantiating to V = 2, and in fact obtain an additional asymmetric version, characterizing fragments that are preserved under simulation. In our running examples, we obtain new concrete van Benthem theorems for [0, 1]-valued metric modal logic (Example 4.4.3) and convex-nondeterministic metric modal logic (Example 4.4.4). We cover, by default, the asymmetric case (to be thought of as characterizing fragments that are preserved under quantitative simulation) and, in the cases V = [0, 1] and V = 2, also the symmetric case (to be thought of as characterizing fragments that are invariant under bisimulation).

### **8 Conclusions**

We have established a highly general quantitative version of van Benthem's modal characterization theorem, stating that given a value quantale V that is totally bounded and continuous from below, all state properties, in a given type of quantitative systems, that are nonexpansive w.r.t. V-valued behavioural distance and expressible in V-valued coalgebraic (first-order) predicate logic can be approximated by V-valued modal formulae of bounded rank. A key technical tool in the proof are versions of the classical Arzelà-Ascoli and Stone-Weierstraß theorems for totally bounded quantale-valued (pseudo-quasi-)metric spaces. Coalgebraic generality implies that this result not only subsumes existing quantitative van Benthem-type theorems for fuzzy [57] and probabilistic [58] systems, but we also obtain new results, e.g. for metric transition systems. Via the additional parametrization over a value quantale, we moreover obtain, e.g., a van Benthem theorem for convex-nondeterministic behavioural distance ('states x, y have distance between a and b') on metric transition systems. Our result complements previous coalgebraic results for two-valued logics [52]. We do leave some open problems, in particular to determine whether the main result can be sharpened to exact modal expressibility instead of approximability, and to obtain a quantitative modal characterization over finite models, in generalization of Rosen's finite-model variant of van Benthem's theorem [48].

**Acknowledgements** We wish to thank Barbara König for valuable discussions.

#### **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

### Author Index

Altenkirch, Thorsten 1
Ayala-Rincón, Mauricio 22
Balasubramanian, A. R. 42
Baldan, Paolo 62
Bednarczyk, Bartosz 82
Bonchi, Filippo 102
Bose, Sougata 124
Boulier, Simon 1
Bravetti, Mario 144
Czerner, Philipp 164
Dixon, Alex 184
Eggert, Richard 62
Esparza, Javier 42
Faggian, Claudia 205
Fernández, Maribel 22
Fukihara, Yōji 226
Gheorghiu, Alexander 247
Ghilardi, Silvio 268
Ghiorzi, Enrico 324
Gianola, Alessandro 268
Graulund, Christian Uldal 289
Guerrieri, Giulio 205
Haase, Christoph 310
Jaax, Stefan 164
Jeffries, Daniel 324
Johann, Patricia 324
Kaposi, Ambrus 1
Kappé, Tobias 510
Kapur, Deepak 268
Katsumata, Shin-ya 226
Kesner, Delia 344
Klin, Bartek 365
König, Barbara 62
Krishna, S. N. 124
Krishnaswami, Neel 289
Kupferman, Orna 385
Kura, Satoshi 406
Lange, Julien 144
Lasota, Sławomir 365
Lazić, Ranko 184
Marin, Sonia 247
Mayr, Richard 427
Michaliszyn, Jakub 82
Milius, Stefan 448
Murawski, Andrzej S. 184
Muscholl, Anca 124
Myers, Robert S. R. 448
Nantes-Sobrinho, Daniele 22
Padoan, Tommaso 62
Peyrot, Loïc 344
Piedeleu, Robin 469
Puppis, Gabriele 124
Raskin, Mikhail 42
Rot, Jurriaan 510
Różycki, Jakub 310
Santamaria, Alessio 102
Sattler, Christian 1
Schewe, Sven 427
Schmidt, Jonas 490
Schröder, Lutz 551
Schwentick, Thomas 490
Sestini, Filippo 1
Sickert, Salomon 385
Silva, Alexandra 510
Szamozvancev, Dmitrij 289
Tantau, Till 490
Toruńczyk, Szymon 365
Totzke, Patrick 427
Urbat, Henning 448
Vale, Deivid 22
van Heerdt, Gerco 510
Ventura, Daniel 344
Vilmart, Renaud 531
Vortmeier, Nils 490
Walukiewicz, Igor 184
Wild, Paul 551
Wojtczak, Dominik 427
Zanasi, Fabio 469
Zavattaro, Gianluigi 144
Zeume, Thomas 490